<![CDATA[breq.dev]]> https://breq.dev https://breq.dev/rss.png breq.dev https://breq.dev node-rss Wed, 14 Jan 2026 14:48:58 GMT Wed, 14 Jan 2026 14:48:58 GMT <![CDATA[Infinite Coffee Glitch]]>

TL;DR

terminal.shop is an online store selling coffee beans via an SSH-based interface. They also provide an API allowing users to place orders via HTTP requests. The /order endpoint had inadequate validation, allowing for nonsensical orders including empty carts, items with quantity 0, and, more interestingly, the ability to get coffee for free by adding negative-quantity items to your cart. All you had to do was call this endpoint with a quantity of -1 for an item and $-22 would be deducted from your order total.

The vulnerability was responsibly disclosed and has been patched. (But even though you can no longer get coffee for free, please still check them out as they sell great products at reasonable prices!)

My girlfriends and I are big coffee enjoyers, and since getting an espresso machine in 2024, we've gotten deep into trying different types of coffee beans in lattes and other drinks. So when I stumbled upon terminal.shop last summer, I figured it would be a fun gimmick and a way to try out some different beans. The process is pretty simple: SSH into terminal.shop and use your keyboard to navigate the menus and order yourself some coffee.

Running a storefront over SSH works really well! Pages load almost instantly (since they're just an 80x24 grid of characters). While you don't have the full abilities of CSS, they've made extensive use of ANSI codes to create an aesthetically pleasing interface. Unlike HTTPS, SSH doesn't have public key infrastructure -- there's no certificate authority asserting that the terminal.shop server is legitimate. However, they put their SSH public key on their website so you can verify it yourself, and once you log in the first time, your client will store the fingerprint to ensure the server is legitimate on future logins.

While we placed our first order because buying something over SSH sounded too fun to pass up, we ended up really liking the decaf and dark roast. However, once we became semi-regular customers, ordering via SSH lost its initial novelty, and I wanted to try something new.

None Coffee with Left Shipping

terminal.shop provides an HTTP API that allows you to place orders. They also have a bunch of client libraries, but I found the bare HTTP documentation easier to read, so I set out to write a Bash script so I could place orders from my terminal without the hassle of a TUI. My goal was to have one command I could run any time we needed to stock up.

Unfortunately, the development backend at api.dev.terminal.shop was down when I tried to test my code, so I sighed, took out my real credit card, crossed my fingers, and hit enter. I immediately got a notification that I had been charged $8, but no confirmation email arrived. After a bit more digging, I realized where I had gone wrong -- I had supplied the product ID of each product I wanted to buy instead of the product variant ID, causing the order to be created with an empty cart, me to be billed $8 for shipping and $0 for my nonexistent items, and (presumably) the backend to crash later while trying to generate my order confirmation.

Immediately after I realized this, I sent an email to support@terminal.shop. I didn't get a response and, for a time, resigned myself to the fact that I had paid $8 for an interesting story. (I don't blame them for not getting back to me; Terminal Products seems to be a side project of a few content creators, and I'm sure that job can be incredibly hectic.)

Bash hacking

I patched my code and ordered again. This time, it was a success! However, something was up with the packing slip...

All four items in the shop were present in both the email and the packing slip, but the two I did not order had a quantity of zero. This was a byproduct of how my ordering script worked: instead of adding each product to the payload only when I ordered it, I always put all four available variant IDs into the payload, with each quantity defaulting to zero unless the corresponding argument was supplied.

#!/bin/bash
# usage: ./order.sh --segfault 1 --404 1
OBJECT_OBJECT_QTY=0
SEGFAULT_QTY=0
DARK_MODE_QTY=0
_404_QTY=0

while [[ $# -gt 0 ]]; do
  case "$1" in
    --object-object)
      OBJECT_OBJECT_QTY=$2; shift 2 ;;
    --segfault)
      SEGFAULT_QTY=$2; shift 2 ;;
    --dark-mode)
      DARK_MODE_QTY=$2; shift 2 ;;
    --404)
      _404_QTY=$2; shift 2 ;;
    *)
      echo "Unknown option: $1"; exit 1 ;;
  esac
done

# ...

ORDER_PAYLOAD=$(jq -n '{
  addressID: "'$SAVED_ADDRESS_ID'",
  cardID: "'$SAVED_CARD_ID'",
  variants: {
    "'$ITEM_OBJECT_OBJECT'": '$OBJECT_OBJECT_QTY',
    "'$ITEM_SEGFAULT'": '$SEGFAULT_QTY',
    "'$ITEM_DARK_MODE'": '$DARK_MODE_QTY',
    "'$ITEM_404'": '$_404_QTY'
  }
}')

This was a purely arbitrary choice on my part: always emitting all four entries seemed easier in Bash than trying to conditionally add items to the JSON payload. I figured that the backend would just filter out any items with a quantity of zero before processing the order. But it seemed like those "quantity zero" items were still there!

Just how much does this endpoint allow?

An idea dawned on us. This endpoint seemed to have pretty poor validation overall. If we placed an order for the two types of coffee we wanted, and also added -1 of a coffee we didn't care about, maybe we could receive those items for less money and thus get back the $8 we lost earlier!

We were almost certain that it wouldn't work, but I ran the script again:

./order.sh --404 1 --dark-mode 1 --segfault -1

Verified the payload it created:

{
  "addressID": "shp_XXXXXXXXXXXXXXXXXXXXXXXXXX",
  "cardID": "crd_XXXXXXXXXXXXXXXXXXXXXXXXXX",
  "variants": {
    "var_01J1JFE53306NT180RC4HGPWH8": 0,
    "var_01J1JFDMNBXB5GJCQF6C3AEBCQ": -1,
    "var_01J1JFF4D5PBGT0W2RJ7FREHRR": 1,
    "var_01J1JFEP8WXK5MKXNBTR2FJ1YC": 1
  }
}

Then sent it off, and watched the response come in:

{ "data": "ord_XXXXXXXXXXXXXXXXXXXXXXXXXX" }

We successfully created an order! Just like that, we got a notification that the card had been charged $30! Then, checking my email, I saw the order confirmation also listed payment of $30: 1 bag of 404 at $22, 1 bag of Dark Mode at $22, 0 bags of [object Object] at $0, and -1 bags of Segfault for $-22.

We waited patiently for the tracking number. Once it arrived, we checked the metadata, and the shipping label listed the package as weighing 12 oz -- the weight of one bag. It would seem, somehow, our order got corrected and we were receiving only the one bag we had paid for.

Until... a box showed up outside our apartment! I picked it up and noticed it weighed far more than a single bag. Once I opened it, I realized we had been sent all four types of coffee!

It appears the order made it all the way to the human fulfilling it without ever being filtered out. They probably saw the packing list, thought "oh, I guess the quantity column is messed up on this one," and packed and shipped it anyway.

It quickly dawned on us what we had actually done: paid $30 for $88 worth of coffee and put Terminal Products in the position of shipping a 48 oz package with a label that said 12 oz. Not ideal!

Aftermath

As it turns out, my girlfriend Ava realized that she unexpectedly shares a mutual acquaintance with some of the terminal.shop team, so word reached them pretty quickly.

Thankfully, the folks at Terminal were super chill about this and quickly patched the issue. Not only do I get to keep four times as much coffee as I expected, but we were given a reward for reporting this in the form of even more coffee. (Friends in Boston, please hit me up if you want a latte sometime!)

From what I can tell, everything exploitable via bare HTTP requests would have been similarly achievable with the official client libraries. The only "trick" was using the API, calling the /order endpoint directly instead of adding items through /cart/item, and just putting weird stuff in the variants field.
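For illustration, here's a minimal Python sketch of what such a direct /order call could have looked like. The payload shape mirrors the one shown above, but the base URL and bearer-token authentication are my assumptions rather than confirmed details of Terminal's API.

import requests

API_BASE = "https://api.terminal.shop"  # assumed production base URL
TOKEN = "..."  # personal access token; the exact auth scheme is an assumption

payload = {
    "addressID": "shp_XXXXXXXXXXXXXXXXXXXXXXXXXX",
    "cardID": "crd_XXXXXXXXXXXXXXXXXXXXXXXXXX",
    "variants": {
        "var_01J1JFDMNBXB5GJCQF6C3AEBCQ": -1,  # the now-patched negative quantity
        "var_01J1JFF4D5PBGT0W2RJ7FREHRR": 1,
    },
}

resp = requests.post(
    f"{API_BASE}/order",
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(resp.json())  # previously returned something like {"data": "ord_..."}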

It was fun to watch the hypotheses come in once this was announced online: was it an obscure SSH feature? Something with SendEnv? Terminal control characters?

These explanations were all unlikely. You might think terminal.shop is built on a traditional server like OpenSSH, with the login shell set to a TUI program, which would thus leave it open to bugs or misconfigurations in a wide range of obscure SSH features. However, the answer is much nicer -- the terminal.shop TUI is built with Wish, a framework by Charm that allows you to create apps accessible over SSH without ever creating an actual shell. It's the same framework I used for fissh.breq.dev, a tiny app I made about a year ago with Ava that presents you with an ASCII drawing of a fish every day at 11:11.

Discovering this vulnerability was a long adventure in the making! I want to thank my girlfriends Ava and Mia for encouraging me and offering advice, AJ Stuyvenberg for getting us connected to the team at Terminal, and of course all the folks at Terminal Products for having an open and positive attitude towards security research. Sometimes, the most powerful bugs are the ones that require the least complicated exploits!

]]>
https://breq.dev/2026/01/14/infinite-coffee-glitch /2026/01/14/infinite-coffee-glitch Wed, 14 Jan 2026 00:00:00 GMT
<![CDATA[Parsing historical MBTA data]]>

Transit and Data

Transit systems and public data are a great match. In my daily life, I interact with so many devices and applications that pull transit data, from the LED matrix in my living room to the app on my phone to the countdown clocks within the station itself. It's also incredibly accessible, as the myriad of DIY projects built on transit data demonstrates.

There are, in general, two types of data about a transit system: what the system is in theory (schedules, routes, and stations determined months in advance), and what the system is in practice (vehicle locations, arrival predictions, and dropped/added trips updated in realtime).

Transit Systems in Theory

Describing what a transit system does in theory is the easy part. The General Transit Feed Specification defines a common format made up (in true 2006 fashion) of a set of TXT files contained within a ZIP file, each describing a different aspect of the system. For instance, stops.txt describes the vehicle stops and stations in the system, stop_times.txt describes when each stop is serviced by a vehicle trip, and trips.txt can specify the train number, whether bikes are allowed, and more.

The benefit of this approach is clear: sharing a common standard means that code written for one city can work seamlessly with others. Smaller transit operators can create these files by hand, and larger ones can build up the necessary automation to handle hundreds of routes and thousands of stops. Since these ZIP files usually only change when new schedules are determined, distributing them is straightforward and storing historical ones is easy to do.

Transit Systems in Practice

Realtime data is where things get messy. The requirements are more demanding. Data usually needs to be generated and distributed without a human in the loop, and clients need to pull updates every minute or faster.

GTFS does offer a solution to this in the form of GTFS-Realtime. However, as a Google initiative, it was built on Protobuf as an interchange format, which requires specific language bindings to work with. My home system, the MBTA, chose to offer a JSON version of their feeds as well.

Even so, I tend to use their excellent, well-documented service API, which makes it easier to ingest only the data relevant to the lines and stations I need. Interoperability with other systems is usually not a priority for my projects.
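As a rough sketch of what that looks like in Python: the endpoint and filter parameter names below reflect my understanding of the v3 API, so treat them as assumptions and check the official docs before relying on them.

import requests

# Fetch upcoming Orange Line predictions at Sullivan Square (place-sull).
resp = requests.get(
    "https://api-v3.mbta.com/predictions",
    params={
        "filter[stop]": "place-sull",
        "filter[route]": "Orange",
        # "api_key": "...",  # optional key for higher rate limits
    },
)
for prediction in resp.json()["data"]:
    # arrival_time can be null (e.g., at a trip's first stop).
    print(prediction["attributes"]["arrival_time"])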

GTFS-Realtime, however, does not specify how historical data should be represented and stored, leaving transit systems to invent bespoke formats for this data -- that is, if they choose to make it available at all.

Historical Data at the MBTA

LAMP

The MBTA has a service called LAMP (Lightweight Application for Measuring Performance), which does three things:

  1. Publish historical GTFS schedule data, showing the state of the system "in theory" for any arbitrary date since 2009.
  2. Publish historical subway performance data (the system "in practice") sorted by service date.
  3. Publish miscellaneous datasets for MBTA-internal uses (while these are public, they are entirely undocumented).

That second point is what we'll focus on for parsing historical realtime data.

Open Data Portal

The MBTA Open Data Portal contains lots of additional reports generated by the MBTA covering ridership, prediction accuracy, and more across the various transit modes. One such dataset is the Bus Arrival Departure Times 2025 dataset, which nicely complements the subway data published by the LAMP team.

Let's get parsing

Subway data

Data format

The subway data we're interested in is distributed in the Parquet format, using URLs like this for each day of service:

https://performancedata.mbta.com/lamp/subway-on-time-performance-v1/YYYY-MM-DD-subway-on-time-performance-v1.parquet

The Parquet file format is an Apache project specification for storing tabular data efficiently. It packs column data compactly but, unlike Protobuf, doesn't require the receiver to know the schema definition ahead of time, which means we can easily throw these files into Pandas, the popular Python data analysis library.

import pandas as pd

path = "https://performancedata.mbta.com/lamp/subway-on-time-performance-v1/2025-10-31-subway-on-time-performance-v1.parquet"
df = pd.read_parquet(path)

with open("data.json", "w") as out:
    out.write(df.to_json(orient="records"))

We can see that the data has 27 columns. While these aren't documented anywhere, here's how I assume the data is structured:

  • Each entry describes a vehicle arriving and/or departing a station as part of a revenue trip.
  • stop_id and parent_station describe which platform and station the vehicle was at: stop_id identifies the platform (for instance, 70513 means the northbound platform at East Somerville), and parent_station describes the station it belongs to (in this case, place-esomr).
  • move_timestamp seems to be when the train started moving towards the given station, and stop_timestamp is when it reached the station.
  • travel_time_seconds seems to be the amount of time it took the train to reach the given station, and dwell_time_seconds is how long it spent there.
  • service_date describes the service date as an integer with a decimal expansion of the form YYYYMMDD... bruh
  • route_id defines the specific route, such as Blue, Green-E, or Red. branch_route_id defines the branch, such as Blue (no branching), Green-E, or Red-A. I am unsure why the Red line branching is treated differently than the Green line here.
  • direction_id is either true or false, depending on which way the train is heading. direction is the human-readable name, like South. direction_destination is the direction given as a destination station or station pair, like Ashmont/Braintree or Boston College.
  • start_time seems to be the time that the vehicle started moving, given as "seconds since midnight at the start of the service day". Since the MBTA defines "service days" as starting and ending around 3 AM, the first vehicles have a start_time of around 17680 (4:54 AM) and the last ones have a start_time of around 95976 (2:39 AM). stop_count is the number of stops that vehicle has made since that time, I guess? (See the sketch after this list for combining service_date and start_time into a datetime.)
  • vehicle_id is a unique identifier for the vehicle, vehicle_label is a human-readable label (usually the number of the first one or two cars), and vehicle_consist is the car numbers of each car in the train.
  • trip_id identifies the trip that the vehicle was on.
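Since service_date and start_time are stored a bit unusually, here's a small sketch (my own helper, not anything LAMP provides) of combining them into a plain datetime:

import datetime

def service_datetime(service_date: int, start_time: int) -> datetime.datetime:
    """Combine a YYYYMMDD service_date with a seconds-past-midnight start_time."""
    date = datetime.datetime.strptime(str(service_date), "%Y%m%d")
    return date + datetime.timedelta(seconds=start_time)

print(service_datetime(20251031, 17680))  # 2025-10-31 04:54:40
print(service_datetime(20251031, 95976))  # 2025-11-01 02:39:36 (service day spills past midnight)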

Simulating trips

Suppose I had been on the southbound platform at Sullivan Station at 5:15 PM. When would I have made it to North Station?

Let's start by finding all the trips which arrived at North Station coming southbound.

trips_to_target = df[(df["parent_station"] == "place-north") & (df["direction"] == "South")]
Stop ID | Parent Station | Move Timestamp | Stop Timestamp | Route ID | Direction | Trip ID
70026   | place-north    | 1761902928.0   | 1761903013.0   | Orange   | South     | 70525780
70026   | place-north    | 1761903114.0   | 1761903199.0   | Orange   | South     | 70525786
70026   | place-north    | 1761903280.0   | 1761903367.0   | Orange   | South     | 70525792
70026   | place-north    | 1761903673.0   | 1761903759.0   | Orange   | South     | 70525798

Now, let's find the earliest one of those trips which also stopped at Sullivan.

trip_ids = trips_to_target["trip_id"].unique()
trips_from_start = df[
    (df["parent_station"] == "place-sull")
    & (df["direction"] == "South")
    & (df["trip_id"].isin(trip_ids))
]
Stop ID | Parent Station | Move Timestamp | Stop Timestamp | Route ID | Direction | Trip ID
70030   | place-sull     | 1761902684.0   | 1761902737.0   | Orange   | South     | 70525780
70030   | place-sull     | 1761902876.0   | 1761902929.0   | Orange   | South     | 70525786
70030   | place-sull     | 1761903061.0   | 1761903110.0   | Orange   | South     | 70525792
70030   | place-sull     | 1761903454.0   | 1761903502.0   | Orange   | South     | 70525798

These are mostly the same trips as before, but now the timestamps match up with each train's stop at Sullivan. If we were dealing with different branches of the Green Line, for instance, this step would also filter out trips which don't run between our station pair.

Finally, let's find the first trip from the start which departed after we got to the station.

Times in this data are represented as Unix timestamps (i.e., seconds since 1970), seemingly in the local timezone (U.S. Eastern time). So, for instance, the first trip in our list arrived at the station at 1761902737 seconds, or at 5:25:37 AM on 2025-10-31.

import datetime
timestamp = datetime.datetime.fromisoformat("2025-10-31 17:15:00").timestamp()
trips_after_time = trips_from_start[trips_from_start["stop_timestamp"] > timestamp]
Stop ID | Parent Station | Move Timestamp | Stop Timestamp | Route ID | Direction | Trip ID
70030   | place-sull     | 1761946007.0   | 1761946055.0   | Orange   | South     | 70526030
70030   | place-sull     | 1761946114.0   | 1761946163.0   | Orange   | South     | 70526038
70030   | place-sull     | 1761946438.0   | 1761946487.0   | Orange   | South     | 70526046
70030   | place-sull     | 1761946646.0   | 1761946693.0   | Orange   | South     | 70526054

The first of those trips is trip ID 70526030, which arrived at Sullivan at 5:27:35 PM.

next_train = trips_after_time.loc[trips_after_time["stop_timestamp"].idxmin()]
train_arrival = df[
    (df["trip_id"] == next_train["trip_id"])
    & (df["parent_station"] == "place-north")
]

Looking up its arrival into North Station, we see:

Stop ID | Parent Station | Move Timestamp | Stop Timestamp | Route ID | Direction | Trip ID
70026   | place-north    | 1761946237.0   | 1761946322.0   | Orange   | South     | 70526030

It appears we would have arrived at 5:32:02 PM.

Bus data

The bus data is a little different. The MBTA provides it as a ZIP file for each year, with a CSV file for each month. At time of writing, the latest one is MBTA-Bus-Arrival-Departure-Times_2025-09.csv.

The data is pretty straightforward, providing the following columns:

  • service_date is self-explanatory.
  • route_id is the bus number. Note that for some buses, the internal route ID does not match the consumer-facing bus number. For instance, route 89/93 is identified as 194 internally, and the first 39 bus of the day is considered route 192 internally.
  • direction_id is the direction identifier, which appears to only ever be Outbound or Inbound.
  • half_trip_id is like the trip_id of the subway data, but since the outbound and inbound trip of a bus route often share a trip ID, this disambiguates them.
  • stop_id is the identifier of the platform at which the bus stops. Large stations like Sullivan provide many bus platforms, each with its own ID.
  • point_type specifies if this stop is a Startpoint, Midpoint, or Endpoint of the route.
  • scheduled and actual give the scheduled and actual arrival times for the bus, respectively. However, for some reason, the date format is of the form 1900-01-01T14:28:00Z... We'll dig into this more later.

Notably, not every stop is covered in this data, only the time points (places along the route with a scheduled arrival time). However, the time points are typically placed at high-traffic stops like subway stations or the start and end of the line, so they are probably useful anyway.

Basic parsing

Let's see if we can repeat a similar "simulated journey" to what we did with the subway, but with the bus data. Suppose I'm trying to get from Harvard to Hynes Convention Center station using the 1 bus on 2025-09-20 at 11:00 AM.

Ingesting the CSV file is quite easy:

import pandas as pd
path = "MBTA_Bus_Arrival_Departure_Times_2025/MBTA-Bus-Arrival-Departure-Times_2025-09.csv"
df = pd.read_csv(path)

The first part is quite similar to simulating a subway trip. Let's try selecting all of the trips that arrived at our destination stop ID, stop 79, then find the records for those trips departing from Harvard, stop 110.

trips_to_target = df[
    (df["stop_id"] == 79)
    & (df["direction_id"] == "Inbound")
]
trip_ids = trips_to_target["half_trip_id"].unique()
trips_from_start = df[
    (df["stop_id"] == 110)
    & (df["direction_id"] == "Inbound")
    & (df["half_trip_id"].isin(trip_ids))
]
Service Date | Route ID | Direction ID | Half Trip ID | Stop ID | Scheduled            | Actual
2025-09-01   | 01       | Inbound      | 68099570     | 110     | 1900-01-01T11:31:00Z | 1900-01-01T11:44:24Z
2025-09-01   | 01       | Inbound      | 68099572     | 110     | 1900-01-01T12:40:00Z | 1900-01-01T13:01:02Z
2025-09-01   | 01       | Inbound      | 68099573     | 110     | 1900-01-01T22:06:00Z | 1900-01-01T22:54:09Z
2025-09-01   | 01       | Inbound      | 68099575     | 110     | 1900-01-01T23:42:00Z | 1900-01-02T00:18:43Z

And finally, we need to filter by trips happening after our chosen start time, so we need to handle the problem of dates.

Dates and times nonsense

How do we assemble the service_date and scheduled/actual fields into an actual timestamp like we got with the subway data?

Let's start by parsing the service date.

timestamp = datetime.datetime.fromisoformat("2025-09-01")
>>> datetime.datetime(2025, 9, 1, 0, 0)

Now, let's parse the timestamp offset. Note that it is in UTC, not local time.

offset = datetime.datetime.fromisoformat("1900-01-01T11:31:00Z")
>>> datetime.datetime(1900, 1, 1, 11, 31, tzinfo=datetime.timezone.utc)

Right now, the offset is an aware datetime, meaning it carries timezone info. Mixing aware datetimes with naive datetimes (those without timezone info) is generally not allowed, so let's make the service date an aware datetime as well.

You might be tempted to use the astimezone method, but that converts a naive datetime to an aware one by assuming the naive datetime is in local time, which is not what we want -- we want to keep the year/month/day values the same and just attach timezone information to this instance. Instead, we can use the replace method to swap out the tzinfo field.

timestamp = timestamp.replace(tzinfo=datetime.UTC)
>>> datetime.datetime(2025, 9, 1, 0, 0, tzinfo=datetime.timezone.utc)

Cool. Now the only thing we need to do with the offset is subtract the placeholder date (1900-01-01). Python handles this nicely: subtracting one datetime from another gives a timedelta object.

offset -= datetime.datetime(1900, 1, 1, tzinfo=datetime.UTC)
>>> datetime.timedelta(seconds=41460)

Now, we can add this to our service date:

timestamp += offset
>>> datetime.datetime(2025, 9, 1, 11, 31, tzinfo=datetime.timezone.utc)

And finally, convert it from UTC to our local time, then strip the timezone info to match the behavior of our other code. If this were production code, we would want to only use aware datetimes... but we're just messing around so let's do what's easy.

import zoneinfo
timestamp = timestamp.astimezone(zoneinfo.ZoneInfo("America/New_York"))
timestamp = timestamp.replace(tzinfo=None)
>>> datetime.datetime(2025, 9, 1, 7, 31)

Cool! So that trip was at 2025-09-01 at 7:31 AM.

Pulling it together

Okay, back to the show. We were trying to filter by trips happening after a given time. Let's add a column to our dataframe with a proper timestamp to match our other data.

import zoneinfo

def convert_timestamp(row):
    if not isinstance(row["actual"], str):
        return pd.NA
    timestamp = datetime.datetime.fromisoformat(row["service_date"])
    offset = datetime.datetime.fromisoformat(row["actual"])
    timestamp = timestamp.replace(tzinfo=datetime.UTC)
    offset -= datetime.datetime(1900, 1, 1, tzinfo=datetime.UTC)
    timestamp += offset
    timestamp = timestamp.astimezone(zoneinfo.ZoneInfo("America/New_York"))
    timestamp = timestamp.replace(tzinfo=None)
    return timestamp

df["timestamp"] = df.apply(convert_timestamp, axis=1)

After rebuilding trips_from_start with this new column, we can select the trips after our chosen time, choose the one that departed earliest, and see when it got to Hynes in a similar way to our subway trip simulator.

timestamp = datetime.datetime.fromisoformat("2025-09-20 11:00:00")
trips_after_time = trips_from_start[trips_from_start['timestamp'] > timestamp]
next_bus = trips_after_time.loc[trips_after_time['timestamp'].idxmin()]
bus_arrival = df[(df['half_trip_id'] == next_bus['half_trip_id']) & (df['stop_id'] == 79)]
Service Date | Route ID | Direction ID | Half Trip ID | Stop ID | Timestamp
2025-09-20   | 01       | Inbound      | 68311684     | 79      | 2025-09-20 11:24:05

I would've made it to Hynes at 11:24 AM. Not bad!

Simulating longer journeys

Why do we care about simulating historical journeys, you might ask? It's useful for answering a few different types of questions:

  • How early should I leave my house to make it to work on time 90% of the time?
  • How reliably, on average, can I make certain connections between modes?

And, my favorite:

  • If I wanted to visit every subway station in the MBTA as quickly as possible, what would be the fastest route?

Simulating an existing speedrun

Transit system speedrunning is a phenomenon in which competitors attempt to travel through all stations within a system as quickly as possible.

My friend Tris completed a speedrun about a year ago and documented her route in a Mastodon thread. Let's see if we can accurately simulate it!

I made a few improvements to the algorithm:

  • Caching the processed data where possible!
  • Instead of using stop_timestamp + dwell_time_seconds, I used the move_timestamp of the station immediately after the given one (see the sketch below). This seemed much more accurate for terminus stations, where dwell_time_seconds was set to null (despite those stations being where the train dwells the longest).
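Concretely, the departure-time logic looks something like this sketch (assuming the same dataframe layout as before; the helper name is mine):

def departure_time(df, trip_id, parent_station):
    # Approximate when a train left a station: use the move_timestamp of the
    # next stop on the same trip, falling back to stop_timestamp at a terminus.
    trip = df[df["trip_id"] == trip_id].sort_values("stop_timestamp")
    stops = trip["parent_station"].tolist()
    idx = stops.index(parent_station)
    if idx + 1 < len(stops):
        # The train "departs" once it starts moving toward the following station.
        return trip.iloc[idx + 1]["move_timestamp"]
    return trip.iloc[idx]["stop_timestamp"]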

For parts of the run involving walking, I set very optimistic transfer times because I can vouch for the fact that Tris walks very fast.

The code for this is on GitHub if you want to play around with it.

Station           | Actual Time | Route    | Direction | Simulated Time
Riverside         | 07:15       | Green-D  | East      | 07:15
Kenmore           |             | Green-B  | West      | 07:48
Boston College    | 08:18       | WALK     | WALK      | 08:15
Cleveland Circle  | 08:41       | Green-C  | East      | 08:25
Park St           | 09:12       | Green-E  | West      | 09:07
Heath St          | 09:45       | 39       | Outbound  | 09:42
Forest Hills      | 09:58       | Orange   | North     | 09:54
Downtown Crossing | 10:21       | Red      | North     | 10:17
Alewife           | 10:50       | Red      | South     | 10:43
Davis             | 10:54       | 96       | Outbound  | 10:46
Medford/Tufts     | 11:04       | Green-E  | West      | 11:07
East Somerville   |             | WALK     | WALK      | 11:16
Union Sq          | 11:24       | Green-D  | West      | 11:21
North Sta         | 11:36       | Orange   | North     | 11:36
Oak Grove         | 12:05       | Orange   | South     | 12:03
Haymarket         |             | WALK     | WALK      | 12:23
Bowdoin           | 12:35       | Blue     | North     | 12:28
Wonderland        | 12:57       | Blue     | South     | 12:50
State             |             | WALK     | WALK      | 13:08
Downtown Crossing | 13:17       | Red-A    | South     | 13:13
Ashmont           | 13:41       | Mattapan | Outbound  | 13:38
Mattapan          | 13:57       | Mattapan | Inbound   | 13:51
Ashmont           | 14:08       | Red-A    | North     | 14:05
JFK/UMass         | 14:23       | Red-B    | South     | 14:20
Braintree         | 14:47       |          |           | 14:45

Not too bad! Especially considering that the "actual time" column is based on Mastodon post timestamps so probably 1-2 minutes behind the actual timing.

Generating the optimal route

My eventual plan is to use this to learn more about what makes an ideal MBTA speedrun. Doing a speedrun has been a bucket list item of mine for a while, and I want to see if I can find something new in terms of route or timing. I think it's unlikely I'll find anything substantial enough to make a difference, but maybe it's possible...

Since this type of simulation runs extremely fast, there are a lot of places you could take this beyond tracing manually entered routes:

  • Taking a graph representation of the MBTA system, running a search algorithm to identify a large set of possible routes, and running simulations on each to estimate real-world timing
  • Simulating a route at various times of day, days of the week, or even months of the year to find the best time to start a run
  • Implementing logic to simulate making decisions at each step of the route based on upcoming train times (e.g., whether an Ashmont or Braintree train arrives first), to evaluate when to make those decisions

Takeaways

I hadn't really done an in-depth data analysis project quite like this before. I definitely learned a ton!

Ingesting data into a structure like this is something I usually consider "grunt work," but working on this showed me how wrong that assumption can be. I tried to throw a lot of this work at an LLM, only to watch it struggle against the date and timezone issues and completely miss the nuance between stop_timestamp and move_timestamp. While I got much further than it did (thankfully, for the sake of my job security), I still had to step away from the problem for a day before I could nail the accuracy.

I had also heard people constantly talk about how timezones are hard and date parsing is a mess, but had been spared the brunt of that struggle until now. I have discovered that there are a lot of bad ways to represent dates and times in software. I feel lucky that I do not need to deal with things like this more often.

The last thing I'll mention is that this project highlighted the difference between working with data in a domain where a dominant format exists (i.e., realtime GTFS), and working in a domain where implementors have no common format to use (i.e., historical transit data). I find it easy to get annoyed at format specifications not perfectly matching what I want to do... but when standardized formats work, they're pretty great!

]]>
https://breq.dev/2025/11/02/historical-mbta /2025/11/02/historical-mbta Sun, 02 Nov 2025 00:00:00 GMT
<![CDATA[Wall Matrix 3]]>

I've built several projects based on LED matrices: my first one in 2021, and a redesign earlier in 2025 with a larger panel and slimmer enclosure. I've also gotten to spend lots of time fine-tuning the display UI as I learn more about icon design, balancing readability and "glanceability" with information density, and dealing with the unique constraints of a display with a 3mm pixel pitch.

Motivation

While the hardware was a dramatic step up from my first iteration, I still thought there was room for improvement: more robust mounting of the encoder and the DC power jack, a slimmer construction that could fit closer to the wall, and simpler internal wiring. The obvious next step was a custom PCB. This project came at a time when I was hoping to get back into PCB design, so this seemed like the perfect opportunity!

At the same time, I had relatively recently started full-time work and wanted to add some decorations to my desk area. Since my previous builds had proven extremely useful at indicating bus times and bike availability, I figured that one on my work desk would be similarly useful! Of course, this one had slightly different requirements, so I ended up building a single board and two very different enclosures for it.

PCB Design

The schematic of the board is very similar to that of the previous build, with a few changes:

  • The Adafruit LED Matrix Bonnet was removed and its level shifting chips were placed directly onto the custom PCB
  • An ADC and photoresistor were added to enable automatic brightness dimming (a substantial pain point on the previous version!)
  • A USB-C port was added to the side of the board as an alternate power option, since using the DC jack on the bottom would prevent standing the display up on a table

The layout followed pretty naturally. I kept the board at 192mm wide to match either a 64-column 3mm-pitch display or a 32-column 6mm-pitch display, gave it rounded bottom corners to match the style of the previous build, and mounted the encoder at the center on one side.

Shameless plug: The renders in this section were made in my KiCAD SVG prettifier tool.

While I like the enclosed build in my apartment, I wanted to leave the option of doing a build with an exposed PCB. The components are all mounted towards the back, and the routing was done mostly on the back side, leaving the front of the board largely empty on both the silkscreen and copper layers. I thought about putting some artwork or text on the silkscreen but couldn't think of a good way to make it look professional -- the usable space on the board is asymmetrical and interrupted by through-hole components.

In the end, I decided to put a repeating pattern in the copper layer of the board, then order it with OSHPark's After Dark colorway, which uses transparent soldermask to let the copper design shine through. I wanted an angular style to match the vibes of the rigid exposed pixel grid, but worried that an actual grid pattern could clash or appear misaligned. Thus, I chose a hexagonal pattern.

Building and importing this pattern turned into more of an adventure than I had predicted! The process I found is convoluted and by no means ideal. If this were something I was doing more often I would definitely design an automated tool for it, but as it stands, the best way I know to accomplish this is:

  1. Make your board in KiCAD.
  2. Draw a shape in Inkscape representing your edge cuts in KiCAD. Make sure the units match up!
  3. Under "Fill", choose a pattern you like and set the scaling appropriately.
  4. Export your KiCAD board as an SVG so you can use it as a reference. The menu option for this is Plot > SVG, then select F.Cu, then check to have Edge.Cuts plotted on all layers.
  5. Import your plotted SVG into Inkscape and line it up using your Edge.Cuts layer as a reference.
  6. Manually draw shapes on top of the pattern to mask off all of the areas you do not want to have your copper pattern. It is easiest to put these in a group. These can be as convoluted or basic as you like -- I mostly sketched rectangles over large components and traces.
  7. Select all of your mask shapes and your shape with the pattern and open the "Shape Builder" tool. Select the area(s) where you want the pattern to show (i.e., the areas without parts). You should be left with a shape matching your intended board, containing your pattern as a fill.
  8. Next you need to turn the pattern into a basic SVG path. The easiest way to do this is to export it as a PNG (make sure your DPI is 500ish), then re-import it and use the "Trace Bitmap" feature. Click "Update Preview", and if it looks good then "Apply."
  9. Export as yet another SVG. (Make your F.Cu traces hidden before exporting!)
  10. In KiCAD, use File > Import > Graphics, select your layer (in our case F.Cu), then place it and manually line it up with your board!

At my girlfriend Mia's suggestion, I added a photoresistor and ADC to the circuit to implement automatic brightness control. I haven't gotten around to using this yet, mainly because the build at my work desk does not require it and the enclosure design of the one at home blocks outside light. In the future I think I could add a small hole to the bottom of the design which would let light into the enclosure but not be too visible.

PCB Assembly

This was my first time doing surface-mount soldering! I kept the components manageable (SOIC chips and 0805 passives). Overall I found the experience much easier than I anticipated. Big thanks to Mia for giving very helpful advice throughout. Thankfully, the board worked on the first try!

Enclosure

For the build sitting at my work desk, I wanted to build something that showed off the internals while still appearing sturdy and thoughtfully designed. I went with two small 3D-printed parts on opposite sides of the assembly which attach to both the PCB and the panel, allowing the device to be freestanding on a desk or table.

For the one in my apartment, I wanted to continue the theme of building something as unobtrusive as possible. Since the PCB was now the same width as the panel, fully enclosing the PCB meant the enclosure would need to be a few millimeters wider than the panel. I took this opportunity to recess the panel into the enclosure, giving the system a much more polished look from the side.

I am not super happy with the large gaps in the design caused by the imprecision of my 3D printer. I do like that 3D printing can give a brightly colored end result, and the form of the parts seems to lend itself well to a 3D-printed design over something like CNC milling. Perhaps I just need to get access to a better 3D printer.

The knob was given to me by a coworker who happened to be getting rid of it. It fits my encoder shaft perfectly! It is a bit taller than I would like but that is mostly a function of how much the shaft protrudes. In the future I will get a shorter encoder shaft or find a way to mount the encoder farther back. I unfortunately can't move the entire PCB back without increasing the thickness of the system because of the height of the 2x8 connector going to the LED panel.

One thing I do love about the new design is that it sits dramatically closer to the wall, making it match the thickness of photos and artwork I have. Things have come a long way from the ~50mm thickness of the original version!

Software

The largest software changes were extending the previous version to support multiple display sizes and moving a lot of the hardcoded logic (ZIP codes, etc.) into a configuration file, so my display at work could show different information than my display at home.

I hacked together a barely-working implementation to test the panel which Ava kindly refactored into something much nicer.

Conclusion

I think time has shown that I'm unlikely to stop building this kind of display anytime soon. They serve a really interesting niche of a device that provides basic information at a glance without being obtrusive, and have a unique and interesting look.

I really enjoyed getting back into PCB design and finally conquering my fears regarding surface-mount work. I hadn't actually designed and gotten a board fabricated since I was in high school, more than 5 years ago! It was nice to exercise that skill again. SMT soldering, on the other hand, is something I had never attempted before, being scared off by my general clumsiness and the precision and tools often recommended for that kind of work. While I am very glad I bought flux at the recommendation of a friend, I found it to be smooth sailing even without new tips for my iron. Perhaps I'll invest in some better tools in the future.

I loved getting to design this iteration with a focus on making it easy to potentially build more in the future. The board came out quite nice, and while there are definitely some things I would change in the future (mostly relating to making the connectors protrude farther out of the case), I am overall quite happy with the results!

Throughout this process, several people have told me that they would love to build something similar for their home. I've considered a few times trying to scale up production, but the parts cost (mostly the panel itself and the Raspberry Pi) makes the total cost of each unit pretty high. There are a few companies making something similar (Tidbyt and RideOnTime are the two I've seen), but they both use small panels in a thick enclosure style like I did in the first iteration of this project. I've grown to really appreciate the large 64x64 panel as an option, a flat enclosure, and a design more focused on information density.

I don't want to start a business selling these or anything, but I could see myself creating better documentation of what hardware to buy, making the software easier to install and use, and organizing a group buy of boards and parts. Shoot me a message if that sounds interesting to you!

]]>
https://breq.dev/projects/matrix3 /projects/matrix3 Sun, 02 Nov 2025 00:00:00 GMT
<![CDATA[FDM 3D printing for the home]]>

I've owned a 3D printer for about ten years now. I got my first 3D printer, a Monoprice MP Select Mini, when I was in middle school as the 3D printing craze was beginning to take off in "maker" culture. My dad likened it to buying his first computer as a kid. But while the home computer was on the verge of an explosion in popularity, the home 3D printer is still uncommon. The machines have undoubtedly gotten better, but FDM (fused deposition modeling) 3D printers still struggle to find broad appeal beyond the same group of tinkerers who were using them years ago.

Disclaimer: My employer works in the field of 3D printing, but that is neither FDM nor for personal use!

My journey with FDM

My MP Select Mini held up okay over quite a few years. The hotend was quite prone to clogs and other issues, so I replaced it with a generic upgrade kit after printing an adapter I found online. Eventually the mainboard started to give up.

I now own an Ender 3 Pro printer that I impulse-bought at Micro Center a few years ago for $99. It works... okay. I've replaced the bed and upgraded to a metal extruder kit. The Z axis still misbehaves a lot, leading occasionally to layers that are too short (causing nozzle drag). I could try replacing the stepper motor, but I don't feel like paying $25 just to try something that might not be the actual fix.

Through my work on the Northeastern Mars Rover Team, I've gotten to work with slightly better printers from Bambu and Prusa. They're generally nicer to get working, but they still experience frequent issues and require consistent maintenance. They're much closer to a tool you'd find in a machine shop than an appliance you'd find in a home.

I've printed plenty of parts over the years, some more useful than others. Owning a 3D printer has enabled plenty of projects that would otherwise be impossible without a more substantial workshop, like my LED wall sign and its sequel, the custom mini-ITX case I made, and more. That said, the printer requires consistent maintenance to keep working, frequent retries slow down the process, and it produces lots of scrap.

While I'm happy to print models for my friends, few take advantage of this -- most people just do not reach for 3D printing when they encounter a problem.

STL websites

Broadly speaking, most personal 3D printing use cases fall into two categories: printing existing STL models from sites like Thingiverse and Printables, and doing CAD work to produce unique parts.

The culture of sharing STLs within the hobbyist 3D printing community is huge. It's trivially easy to download a model from a website, drag and drop it into your slicer, and export G-Code to an SD card for your printer. However, in my experience, most of the models available on these sites are not particularly useful. The vast majority are either decorative parts, or utilitarian parts like headphone stands, shelving and organization products, and cable management parts.

In most cases, commercially bought parts are available at similar cost and with dramatically better build quality, surface finish, and durability. In some cases it is even faster to go to a physical store to buy a product than to run an FDM printer for hours.

I can't speak as much to printing artistic models or figurines, since that aspect never particularly appealed to me. Without postprocessing, FDM 3D prints just don't look that great.

There is undoubtedly a sweet spot here: items that are popular enough that you're likely to find an existing model, but not popular enough to warrant that product being commercially produced. Mounting brackets for specific network switches and similar parts sometimes fall into this category. But the bulk of the time I've searched for such a part, 3D models haven't been available at the usual places, and I've had to make it myself.

CAD software

A print I recently designed and made for organizing floppy disks for my Mavica.

In my experience, without an understanding of some form of CAD software, owning a 3D printer is of limited utility. Contrast this with the personal computer: most people who use computers on a daily basis do not know how to program! The typical user can meet most or all of their needs by downloading and utilizing existing software.

I was lucky to learn CAD software in middle school as part of "Technology and Engineering" class, in the golden age where SketchUp was the teaching software of choice instead of TinkerCAD. SketchUp undoubtedly has limitations (I still don't know how to draw a sphere), and the mechanical engineers I've worked with have generally laughed at me for using it. That said, I can reasonably efficiently build things.

Most people I know do not have the motivation or time to learn CAD software just to design and produce one or two parts every few months.

CAD takes a while, too! Designing good parts requires thinking about the direction of layer lines, need for supports (or lack thereof), balance between strength and filament usage/time, and more, on top of just constructing a part that fits the desired functionality and looks pleasant.

Can we make CAD more accessible somehow? The advent of tools like TinkerCAD has certainly helped, but building useful parts within its constraints can be difficult. Maybe generative AI can democratize building shitty 3D models in the same way it democratized building shitty software? The attempts I've seen at extending generative AI to CAD haven't been great.

The whole promise of 3D printing is customized parts built at much faster timescales and with less equipment than otherwise possible. But the typical home user only has the occasional need for a customized part. I use my printer about once a month, sometimes less, mostly because of the time investment of CAD.

Fitting into the home

My 3D printer sitting atop an IKEA KALLAX. It is quite big!

Will 3D printers ever be common home appliances?

One could argue that their limited utility precludes that possibility. But the success of Cricut machines proves that "maker"-oriented devices for the home can be successful! The use cases for the Cricut are similar to those of an FDM 3D printer, the machines are about the same size, and the process of using one is comparable. But Cricut machines require far less tuning and setup, making useful things on a Cricut requires much less specialized knowledge, and I would guess that Cricut designs have a higher success rate on average than 3D prints.

I think the type of community surrounding hobbyist 3D printing also plays a role. 3D printing fans are often willing to tinker with their printers, want the ability to customize them, and want to bring their own slicer and tooling. That's great for consumer advocacy, as evidenced by the failure of RFID-chipped vendor-locked filament among all but the most obscure printers. But it tends to direct the innovation in the 3D printing space towards chasing ultimate customization and featureset instead of building a cohesive user experience like Cricut has. I recognize that buying cheap hobbyist-oriented printers makes me part of this problem, but ultimately, that's the type of printer that fits most home users' budgets.

I really love owning a 3D printer! And I really wish I got to use it more. I am a huge fan of customizing my living space to fit my life with objects that exactly meet my needs, fit my aesthetic, and work with the other objects in my home. 3D printing allows me to do this, and I think that's wonderful.

]]>
https://breq.dev/2025/10/03/fdm /2025/10/03/fdm Fri, 03 Oct 2025 00:00:00 GMT
<![CDATA[I bought a Sony Mavica!]]>

I bought a Sony Mavica! And everything about this camera has been such a joy, I can't help but write about it :)

The Digital Mavica line of cameras saved images directly onto a floppy disk. The model I own is the MVC-FD90 which was released in 2000. It captures images at a maximum resolution of 1280×960, but can also shoot at 640×480 to save space on the floppy disk (which can store around 20 photos at that resolution).

Inspiration

Earlier this year at Anthro New England, my girlfriend Mia was taking photos on her Mavica and let me try it a bit! It seemed like such a fun and whimsical camera to keep around.

While I was intrigued by the camera, I was a bit intimidated by the floppy disks -- Mia has much more experience with retrocomputing than I do, and I worried that getting disks, a drive, pulling images off the camera, and keeping everything in good shape might be too difficult for me.

Swapfest my beloved

It was at the most recent MIT Swapfest that I picked up my beloved Mavica. I purchased it from some very persuasive transfems who had a table dedicated exclusively to Mavica stuff. I bought the model they recommended and they helped me get set up with a battery, strap, etc. I figured that, now that I'm getting settled into post-college life, now was a good time for me to pick up something new!

I'm a big fan of Swapfest: it's almost more of a community gathering to me than a place to buy stuff. Whenever I go I see so many friends, former classmates, and colleagues I know from work. You get to know the tables that come back every month and the people who sell regularly. All in all, it's a great way to spend a Sunday morning!

Meticulous unscheduled disassembly

Mavica experts will know that aftermarket batteries tend to be a bit oversized compared to the originals. The folks who sold me mine made sure I was aware of this and even gave me a ribbon to put around the battery. However, in my excitement to try it out, I totally forgot about this and immediately put the battery directly in. I had to disassemble the camera to get it out!

Big thanks to Sony for putting that plastic cover over the flash capacitor... that could've been a nasty shock otherwise 😬

My first disk

A few tables at Swapfest hand out free floppy disks to passersby. I had accumulated a total of three disks and found that, miraculously, one of them actually worked! I had a mild panic moment when I realized the camera couldn't always successfully read all of the photos on the disk, but it later turned out to be easily readable with a USB floppy drive.

At the time, I didn't have any way to read files off of the disk, so I just took as many as I could until it filled up.

Getting serious

By this time, I had had enough fun with the camera that I figured I should actually get ahold of some disks and a floppy reader!

Mia advised against me buying a random floppy drive on Amazon, so at the recommendation of my friend Tris, I bought a Dell Latitude drive online for about $6. It's designed to be put into a slot on the laptop, but it actually just has a normal Mini-USB connector. I bought some disks from an Amazon listing that looked legit (it was Maxell brand and had good reviews).

The disks arrived just in time for me to take them on a trip to New York City!

Data recovery adventures

After getting back from NYC and finally being able to read the disks, I discovered that a few of the photos weren't readable. This wasn't a big deal (the photos I cared most about were fine), but it definitely taught me to be less careless with floppies. My backpack has dozens of tiny magnets in it and I struggled to find a spot far enough away from all of them!

Mia recommended I use ddrescue to try to recover parts of the images. It ended up not giving me anything more than I got from copying files over, but was interesting to learn. I could definitely see it coming in handy in the future.

sudo ddrescue /dev/sda hdimage mapfile
sudo mkdir /tmp/hdimage
sudo mount -o loop hdimage /tmp/hdimage
cp /tmp/hdimage/* .
sudo umount /tmp/hdimage

To reformat the disks, I used a tool called ufiformat to re-lay the tracks and then used the Mavica's built-in disk format tool to recreate the filesystem.

sudo ufiformat /dev/sda

More to come

Thus far I'm really enjoying shooting on the Mavica! Limiting myself to a specific number of shots per disk helps me be more thoughtful than with a modern digital camera, but unlike with film, if I take a bad shot I can just delete the file -- I haven't wasted anything.

Dealing with floppy disks as a medium is much more fun and less stressful than I anticipated! The disks themselves are cheap, devices for reading and writing them are cheap, software support is quite good even on modern machines, and with a little care they're easy enough to keep in working order.

I've been in a bit of a lull with photography lately, but I think I might've found something to get me back into it!

]]>
https://breq.dev/2025/07/01/mavica /2025/07/01/mavica Tue, 01 Jul 2025 00:00:00 GMT
<![CDATA[My Experience with Gender]]>

I've noticed that I can be far too quick to assume that the experience of others matches my own. This is one of the reasons I've held off on writing about being transgender for so long: I figured I would just be telling my transgender audience things they already know. But over the past four years, I've realized that the way I experience gender differs substantially from the way every other trans woman I've talked to experiences it, just as every trans woman I know has a unique and deeply personal experience compared to any other. I aim to highlight this broad range of experiences when I portray my own.

The other reason I have held off on discussing my experience is that I often mistake the hesitance of my cisgender friends to ask questions as apathy. While I know many other transgender people are less open about their experience (which is understandable!), I have always been overjoyed to have conversations about gender and my existence as a trans woman with the cis people in my life. I hope this post can lay out the groundwork and serve as an invitation for people I know to start these conversations with me.

My wonderful girlfriend Mia has also written a blog series about being transgender, which you can find here. You should read that one after this -- I always look up to her wisdom when it comes to big topics like this :)


Update: My girlfriend Ava and my friend Tris have also written very insightful posts about this!

Early Transition

A commonality between most transgender people I know is that our decision to transition is defined by strong emotions. Deciding to transition is deciding between something very known -- your everyday life, as it is now -- and something entirely unknown. It is impossible to know beforehand what emotions you will experience after coming out. Nobody can tell you with perfect accuracy if transitioning is something that would make you happier.

The only way to know for certain is to try it in a truly immersive sense: name, pronouns, and the way you present yourself in every aspect of life. Unfortunately, we live in a society where this is often somewhere on the scale of impractical to impossible. And thus, most trans people are going in blind.

In my case, the strong emotion that determined my decision to transition was fear. Not fear of stepping into this unknown experience, but fear of not doing so. I was depressed in the years leading up to the start of my transition (for me, the last two years of high school). Had I not transitioned, I may not have made it to here today. Despite being a complete unknown, transition felt like my best chance of survival.

While I know my story is far from unique, it brings me happiness that there are those for whom this decision is led primarily by positive emotions, too.

I chose to delay my social transition until starting college since moving cities let me avoid or postpone coming out to people I already knew. While diving headfirst into being "Brooke, she/her" was overwhelming at times, it was better than having to hold a "coming out" discussion with everyone in my life at the time.

It is a cruel irony that one of the most difficult parts of transition -- having to explain to everyone in your life what being transgender is and answer countless questions about something so deeply personal to you -- happens so early on in the process, before you have had a chance to build confidence in yourself or even grow to understand yourself. It makes me happy to see communities in which gender fluidity and questioning are more widely accepted and supported.

For the first year of my transition, I honestly didn't have the energy to think critically about my own experience of gender. I knew that presenting femininity felt nice, but the details didn't matter -- I was too focused on learning how to buy clothes that I like (and throwing out everything I knew about how clothes were "supposed to" fit), learning how to take care of long hair, and overcoming my fears, one step at a time.

I take my confidence for granted nowadays, but I still vividly remember my first time leaving my dorm room wearing a skirt. I remember noticing how much my hands were shaking when reaching for the elevator button, and trying to make myself small in any way possible. I am six feet tall, and while I enjoy being loud and confident nowadays, there was a long stretch of time when I just wanted to blend into the background.

Gaining confidence

The next year of my transition is when I finally started to feel like I could "take up space" in social gatherings. I dyed my hair bright pink for the first time and started to assert myself more. I finally started to feel okay with using the women's restroom instead of walking halfway across campus for a gender-neutral one.

While I made a few false starts at voice training, over time I realized that I had become comfortable with my voice as it is, without deliberate training. I'm less likely to be gendered correctly by strangers in public, but I can feel secure knowing that my friends, peers, and coworkers will not see my voice as evidence that I am any less of a woman. I'm not ruling out voice training in the future, but it would be on my terms.

My goal in transition is not to assimilate into cisnormative society. It is to wake up each morning and present myself to the world in a way that sparks joy.

Being transgender does not define my life nowadays to the extent it used to a few years ago. On one hand, the day-to-day stresses of my early transition years have subsided, and I am much happier as a result. On the other hand, I do think my trans-ness -- my experience of transitioning, my presence in transgender social circles, and the unique perspective I have as a result -- is one of the most notable pieces of my life.

I would say that in my day to day life, I am a very happy person. I would also say that my trans-ness is an aspect of my life that brings me happiness. This would have been unfathomable to my past self.

Finding stability

The word "transition" implies that the process of changing one's gender happens over a well-defined time interval. This is, in my eyes, inaccurate.

In one sense, my transition "ended" when I made the commitment to myself that I was a woman. The social steps -- changing the way I dressed, coming out to friends and family, updating my legal name -- were all just logistics.

In another sense, transition is never "done." As my body cannot naturally produce estrogen, I will keep taking hormone replacement therapy for as long as I live. The style of clothing I wear will continue to shift throughout my life. There are still people who knew me before transition and who I will need to come out to if we ever cross paths again.

It is only recently that I have felt stable enough to reconsider some of the fundamental questions of my transition. I enjoy the way I dress, the way people refer to me, and the effects that feminizing HRT brings to my body. But the label of "transgender woman" was less something I decided on and more something that was handed to me as a result of this.

I've begun to question whether "non-binary" describes the way I experience gender. I've always felt a bit of dissonance with being referred to as a woman, and for the longest time, I wrote that off as imposter syndrome. But four years in, I'm as confident as can be, and that feeling is still there.

Perhaps this questioning could only happen this many years in. It's taken time for "they/them" pronouns to stop feeling like a way for people to avoid acknowledging my transition and gender, and start feeling like them acknowledging the complexities of my gender beyond the binary.

Alternatively, perhaps this was down to a lack of representation of non-binary people in my social groups. Early in my transition, I sought out friends with experiences similar to mine, and ended up in groups consisting primarily of binary trans women. At present, several close friends of mine are non-binary, and I've been able to have many more interesting conversations about gender with people outside the gender binary. These conversations have undoubtedly been a catalyst for questioning my own gender identity.

Conclusion

There has been an ongoing bit on this website for the past three years in which, with some random probability, the pronouns displayed on the homepage show either "she/her" or "she/they". I wish I could remember what was in my head when I wrote that.

One explanation is that labels have always felt unimportant in relation to my own experience. While I recognize how powerful labels can be in helping people explain their identity and connect to others, I find their imperfect fit frustrating. I care less about whether I am called "she" or "they," and far more about the motivations of those calling me this. Am I being called "they" in an attempt by the speaker to avoid recognizing or legitimizing my identity as a woman? Or am I being called "they" in recognition of the fact that my identity does not perfectly map onto the gender binary?

I hope that in five years I can re-read this post and find myself disagreeing with parts of it. I hope that my perception of myself, of transness, and of the way I fit into society will shift over time. I want to continue thinking about, exploring, and experimenting with my own sense of gender identity.

]]>
https://breq.dev/2025/06/22/gender /2025/06/22/gender Sun, 22 Jun 2025 00:00:00 GMT
<![CDATA[My Hair Routine]]>

I have very thick, curly, bleached hair, which is about as "hard mode" as you can get. I have gotten so much bad advice from people about how to care for it. After a few years and some tips from friends, here's the system I've ended up with!

I've added products I use in the toggle elements below, since I think providing specific examples can be useful, but I don't want this to come across as too much of an endorsement. I know enough about this topic to solve problems, but I'm not 100% sure I've landed on the best set of products for my hair. You, dear reader, likely have different hair with different needs anyway.

Last year at the University Rover Challenge. I've grown out my hair quite a bit since then!

Requirements

To give some context, here are the requirements I have for my hair routine:

  • It must actually solve the tangle problem (this one is very hard)
  • Regular maintenance must be a reasonably straightforward process, since I am lazy
  • Products should be as easy to get as possible (ideally, I should be able to get everything from a CVS or Target)
  • The process should not impede my ability to travel -- I should easily be able to pack a small, TSA-legal bag for week-long trips with necessities

Regular maintenance

Whenever I shower, there are two main hair-related tasks I focus on. Detangling is the endless struggle I face, and what the majority of my routine is built around. Washing ends up essentially being a by-product of the detangling process, not the other way around.

I almost never use shampoo! My regular process consists of two steps: a washing conditioner (aka "cowash") and a leave-in conditioner. I first add the washing conditioner at the roots of my hair, scrub it in, let it sit while I wash my body, loosely detangle my hair with my fingers, then rinse it out. Then, I add leave-in conditioner, focusing on the roots of my hair, and use a wide-tooth comb to continue detangling.

Often when I mention that I don't use shampoo, the response I hear is "my hair is so oily, I could never" -- and that's completely valid! I am lucky that oily hair is a problem I don't face, and I find that cowash helps much more with tangles, so it's the solution I use.

Products

I use As I Am Coconut Cowash as my washing conditioner. For leave-in conditioner, I use either As I Am Classic Leave-In Conditioner or Kinky-Curly Knot Today. I use a Wet Brush Detangling Comb with the zig-zag teeth.

Less frequent maintenance

Very occasionally, I will use shampoo if I need to clean my hair more vigorously than washing conditioner allows. I've heard that sulfate-free types are better for "low-shampoo" routines like mine, since they don't strip away oils as much.

I do also have a styling cream I can use, but I don't find myself reaching for it very much.

Products

The shampoo I use is Herbal Essences Bio Renew Aloe + Eucalyptus Scalp Balance Shampoo. The styling cream is Shea Moisture Curl Enhancing Smoothie.

Coloring

Me in 2021 at the start of college -- I looked pretty different at the time!

Unfortunately, bright pink is not the natural color of my hair, so I need to dye it every month or so to keep it colorful!

I dye my hair DIY-style, for two main reasons. The most obvious is that I started off with hair dye as a student -- I can afford to get it professionally dyed now, but I couldn't back then. But more importantly, I view it as a really fun activity to do with people I'm close with. The joy of doing it with someone I know vastly outweighs the somewhat inconsistent results I get :)

I don't always use the same shade of dye each time, but the resulting color stays pretty consistent since it blends with the residual color from prior applications.

Every few dye sessions, I touch-up bleach the roots of my hair. The initial bleach was a massive ordeal that took three boxes, but since then, I've just used one box every few months.

I also dye my girlfriend Ava's hair with the same setup.

An older shade of dye I previously used. Picture from 2023.

I use a semi-permanent dye (doesn't require developer) since it's much less stressful and easier to apply, and it seems to last as long as the "permanent" ones I've used. (I've accidentally fallen asleep with hair dye in and been fine!) I used to use a boxed "permanent" dye (and even once went on a 2+ hour bus journey to a CVS in the Boston suburbs when the chain stopped carrying it, just so I could buy up the remaining stock), but in hindsight I didn't like the color it gave as much.

Products

I use Arctic Fox dyes, mostly Virgin Pink but sometimes Electric Paradise. I've tried Girls Night and Porange but haven't continued to use them. Ava's hair is Purple AF. I also use a mixing bowl and brush from Arctic Fox.

For bleach, I usually buy a generic boxed kind such as Colorista All Over. I don't use the included toner since I'll immediately put color on top of it anyway.

The original box dye I used was got2b Metallics Sakura Pink.

Color conditioner

A few people I know have recommended creating a color conditioner by combining my conditioner with a few drops of hair dye. I'm interested in this idea to supplement my regular re-dye process, but there are a few reasons I've held off:

  • I worry about color bleed from my hair staining things -- currently this is only a problem for the first few washes after a dye session, but if I am constantly reapplying dye, it could cause more problems
  • I like being able to loan my hair products to friends who are staying the night, etc., most of whom do not want pink hair
  • I'm happy with the way my two types of conditioner work, and I don't want to mess with the system :)

That said, I will probably still try this at some point once the time is right!

Travel

Me this summer traveling for the University Rover Challenge.

I take a lot of overnight or weekend trips, and I like not having to worry about having my favorite hair stuff with me, so I keep my hair products in 3oz travel containers. It's also quite helpful for fitting all of the containers onto limited shower shelf space, and for dispensing small amounts of cowash (since mine comes in a tub).

I haven't found a good solution for labeling these yet, and I'm worried that I'll eventually mistake two similar-looking products, but so far I've been getting by with occasionally re-writing labels on them with sharpie.

Products

I just bought a random set of containers from Target; I can't find the exact one I got anywhere. The important part is just that it's 3-4 small bottles with squirt lids. Search "TSA Travel Container Set" or "TSA Travel Bag Kit" on their website and you'll find similar stuff.

Conclusion

This system has worked for me for the past ~4 years! I think it balances my needs well and I'm proud that I've found something that works.

A recent-ish picture of me having fun in the Rover lab, with a slightly oranger shade of dye.

]]>
https://breq.dev/2025/06/21/hair /2025/06/21/hair Sat, 21 Jun 2025 00:00:00 GMT
<![CDATA[Tiny Devices I Love]]>

I've developed a strong appreciation for small devices that solve problems for me. While larger versions of these tools exist, there are a few reasons why these tiny tools hold a special place in my heart.

The first is that I have spent the last several years living in small dorm rooms and now live in a relatively small apartment. While I wish I had the space for a dedicated electronics workbench, the practical reality is that my tools need to fit either on my normal-sized desk or be stored away in a drawer. The second is that, both for my work with the Northeastern Mars Rover Team and for my own projects, I do occasionally end up needing to fix, debug, or bodge something together in locations which are less conducive to this type of work (such as the middle of a grassy field).

The criteria I decided on for devices in this category include:

  1. The device must be "cute." More formally, it should be roughly small enough to fit into a large pocket, and it must be smaller than most others in its class.
  2. The device must be something I would carry somewhere to perform a task. While I love my tiny Netgate 1100 (a gift from Ari), it's not something I would transport to somewhere for a temporary installation.
  3. The device must be good at what it does. It must not sacrifice functionality to achieve its small size.

Acknowledgements: This post was loosely inspired by my friend Hunter's Devices of All Time post. Many of these devices were gifts from close friends!

iFixit Moray Driver Kit

Gift from Ari and Hunter!

This is a screwdriver kit including a small screwdriver and a large collection of bits. It comes in a hard plastic case, the lid of which doubles as a tray for storing screws.

I've used the Moray's older sibling, the Mako, extensively on the rover team, but haven't had my own iFixit kit until recently! I like to flip the bits I'm using for a project down to visually distinguish them from the rest and make them easier to grab.

Likes:

  • Very small and cute. Easy to throw into a backpack or tool bag.
  • Extremely versatile, good selection of bits. Replaces the need to carry a whole range of screwdriver sizes.
  • Handle size is large enough for "big screwdriver" tasks, but small enough to allow for delicate work.
  • Spots for bits are labeled with both an icon and text.

Dislikes:

  • No 2.0mm hex bit. (I encounter M2.5 screws enough that I'm surprised this was omitted!)
  • While additional bits are available, there aren't any additional spots in the case to hold them. (It's possible to store them within the screwdriver, but that makes them harder to access.)

FNIRSI DPS-150 DC Power Supply

Gift from Mia!

This is a benchtop DC power supply, which is an essential tool for electronics prototyping. This type of power supply is useful because it provides precise voltage control, along with current monitoring and limiting to protect your circuit. It provides power through "banana plugs" or directly to wires via the two terminals.

Likes:

  • Taking power from a USB-C port greatly reduces the size of the unit.
  • Screen is quite readable and useful for precise adjustments.
  • Maximums of 30 volts and 5 amps.

Dislikes:

  • Display hinge struggles to hold itself up.
  • Voltage/current input is a bit unintuitive.

GL.iNet Opal (AC1200) Router

This is a "travel router": A small, lightweight router and wireless access point powered over USB-C. This one in particular is quite cheap and actually ran my home network for a bit until I was able to set up the pfSense unit I use now. It's also saved the day at Rover events a few times when I've used it to establish a wireless network in areas otherwise slightly too far from a nearby building's coverage.

Likes:

  • USB type-C power input.
  • Two LAN Ethernet ports.
  • Strong featureset provided by OpenWRT.
  • Antennas fold in for compactness when not in use.

Dislikes:

  • Runs a very old patched version of OpenWRT, not the mainline version.

RadioShack 22-182 Digital Multimeter

This was a gift from... my dad probably? I've had it as long as I can remember.

This is a multimeter: a device that measures the voltage difference between two points, the current flowing through itself, or the resistance of an electrical component. While some meters support "autoranging," this one requires that you specify the range of values you expect the quantity you measure to be in (e.g. 200V, 20V, or 2V). I honestly prefer manual ranging meters, since dialing in the voltage at the start is usually easy to do and the meter responds more quickly when it doesn't need to go through the autoranging sequence.

Likes:

  • Carrying case keeps probes inside.
  • No need to switch probes between sockets for current vs voltage measurement.

Dislikes:

  • Current measurements limited to 200mA.
  • Case is slightly too small for probes to fit comfortably.
  • No beeper for continuity testing.
  • Uses an uncommon 12V battery.
  • Not as precise as larger meters.

Pinecil V2

Gift from Ari!

This is a soldering iron: a tool that heats its tip up to a high enough temperature to melt solder (metal) and join electrical wires. Other potential uses include pushing heat-set metal inserts into plastic parts. This is a temperature-controlled iron, which is essential to allow heating solder efficiently without damaging components from excessive heat.

Likes:

  • USB-C power input.
  • Readable monochrome OLED display.
  • Indicators for iron cooldown.
  • IMU-based standby power-off.
  • Easily swappable tips.

Dislikes:

  • Doesn't include a stand or case of any sort.
  • Some intermittent issues with connectivity between the iron and tip.
  • Two-button based interface is sometimes unintuitive to use.

AIOC (All-in-One Cable)

This is from a run made by Mia, although I believe I paid her back for it? Mia if you're reading this and I owe you money, please let me know :)

This is a device with two purposes: to upload configuration data ("codeplugs") to handheld radios, and to connect a handheld radio to a computer as a soundcard. The former use case is nice for programming radio channels as it allows configurations to be easily created and distributed. The latter enables computer-driven digital modes such as AX.25 packet.

Likes:

  • Smaller than conventional radio programming cables.

Dislikes:

  • Alignment between 2.5mm and 3.5mm jacks is not sturdy.
  • Can't be used to program my Retevis RT3S.
  • 3D printed case is flimsy.

Conclusion

You'll notice I didn't put direct links to where to buy any of these (and some of them, like that Radio Shack multimeter, are likely impossible to get these days). My intent is to show how these tools fit the style of work that I do and enable me in ways traditional alternatives don't, and to maybe help you consider how the tools you use in your work shape the projects you take on and the methods you use.

]]>
https://breq.dev/2025/05/02/pocket-size /2025/05/02/pocket-size Fri, 02 May 2025 00:00:00 GMT
<![CDATA[Mini-Rack Homelab]]>

I recently reorganized all of my home networking hardware into a 10" rack that sits in an IKEA KALLAX shelf next to my desk.

Why Mini-Rack?

I was motivated to take this project on by Jeff Geerling's Project Mini Rack website, which collects links to hardware and guides for rackmount builds in the 10" form factor. I've wanted to build out a rackmount system for ages, partly because I want a rigid frame for my networking hardware (as opposed to a loose collection of boxes sitting atop a shelf), and partly because I love the aesthetics of rackmount gear. I could undoubtedly achieve the same goals without a rack at much lower cost, and I had largely written the idea off as impractical given the small size of my apartment, but the proliferation of 10" gear has made it possible to set up a compact rackmount homelab system.

Goals and Improvements

My existing home network setup is designed for a small apartment which I share with my girlfriend Ava. We get internet service from a cable connection (no fiber here unfortunately), and we bought a basic Arris SURFboard modem. From here, we use a Netgate pfSense router gifted by my other girlfriend Ari, which also runs a site-to-site Wireguard VPN connecting to her home network. The router connects to a Ubiquiti U6 Mesh access point which both provides our personal Wi-Fi network and functions as a working eduroam service provider with client isolation (the details of which will remain a mystery).

The wired side of the network runs through a Netgear GS105 switch which I got at MIT Swapfest for five bucks. (My switch was actually, unbeknownst to me, featured on my friend Hunter's blog last year.) From here, wired connections run to my desktop and Ava's desktop.

My primary goal for this project was to reduce the clutter on my shelf and replace it with something that looked cool. I also wanted to finally have a good way to self-host things again. While I'm happy to keep most services hosted online, it would be nice to have some services running locally where convenient.

Networking Setup

I chose a RackMate T0 case, since it would fit within a single shelf of the IKEA KALLAX next to my desk (and because the next size up was 8U and I doubted I would have enough hardware to justify that). The case arrived with a broken top acrylic panel, but the folks at GeeekPi (DeskPi) shipped me a replacement quickly without any hassle! In the meantime, I got to start building out the rest of the rack.

No rack is complete without a patch panel, and mine needed some way to pass cables through from the front-mounted Ethernet ports on my router and network switch. I chose a 12-port model from DeskPi, designed for the rack I used. This provides more ports than I need for my modem connection, access point, and our desktops, but the Ethernet ports can be swapped for any keystone block in case I want to use them for some other purpose.

One problem I wanted to solve with this design was the bundle of power bricks tangled up behind my shelf. I saw the DeskPi PDU Lite available, and decided to use it in combination with a 12V 8A power supply to power devices in the rack. Right now, it powers my modem, router, and GS105 switch, but there are 4 additional channels which could be used for an SBC, hard drive mount, etc.

Proxmox

After using the rack for some time, I finally broke down and spent the money on an actual server. I went with a refurbished Lenovo ThinkCentre M710q that I bought for around $75, since it had relatively modern hardware, would fit within a 10" 1U shelf, and uses relatively low power.

A few friends recommended I try Proxmox for this project, so I decided to give it a try! It supports full VMs as well as containers, which is nice. Previously I've relied on solutions like Dokku which only support container-level virtualization.

UniFi Network Controller

Shortly after I moved, I invested in a Ubiquiti access point for the new apartment, and I absolutely do not regret my decision. However, this AP has one minor annoyance for home use compared to a traditional residential access point. Instead of being configured via a web UI running locally on the device, Ubiquiti APs are configured by a central server. This approach is great for managing a large deployment of access points but slightly annoying for a home setup.

However, there's an upside: Ubiquiti's controller software can be self-hosted rather easily! I decided to do this first, and after spending 5 minutes trying to find where to install container templates from, I got the UniFi controller running in an Ubuntu container. After exporting and importing my config to transfer it to the new controller, I can finally manage access point settings without needing to boot my desktop!

HomeAssistant

Ava is a big fan of smart home devices, so we've ended up with a dozen smart light bulbs in the apartment. While running these through Google Home works great for basic usage, the lack of an extensible API is something we've missed dearly. Our home has already started to collect weird and wonderful homemade devices, and an open platform would give us much greater opportunity for fun shenanigans.

On the recommendation of a friend, we're trying out HomeAssistant to run our devices. Most of our devices are based on the Matter standard, which should (in theory) enable fully local control. The process of importing Matter devices went pretty smoothly, although we needed to tediously rename each device we added. We do have a few weird bulbs we bought to fit in an IKEA fixture that used the Tuya app, but HomeAssistant had an integration for those as well which seems to work perfectly.

Reverse Proxy

With all of these services on the same box, I'll definitely need to set up a reverse proxy for these services. My go-to for this is usually nginx, but I decided to try Caddy instead to avoid the hassle of setting up HTTPS certificates. While I did find it more annoying to debug, adding new hosts and proxying new services is turning out to be a breeze so far!

The reverse proxy also serves a basic static site, mostly as an excuse for me to make a cute drawing of the rack and my desktop.

Syncthing

For my "hot workspace" of files (documents, code, etc), I like to store things in Syncthing so they're available on each of my devices. It's the best cross-platform tool for file sync that I've found, and is really easy to self-host!

I created a container and followed the Syncthing setup steps for Ubuntu. The only snag I hit was that I couldn't log in remotely to set up the GUI from a different device. It turns out the fix is to edit the config so the GUI listens on 0.0.0.0 instead of 127.0.0.1, which is pretty straightforward:

nano /root/.local/state/syncthing/config.xml

The documentation for setting it up to run with systemd also worked perfectly.

NAS

I tend to keep only actively used files on Syncthing since my devices are constantly running out of storage space. For long term storage, Ava and I used to use NextCloud, but found it annoying to keep it running and stable. Thus, we switched to a basic Samba and WebDAV share for this.

While I don't immediately have hard drives ready to add, I was able to get started by just setting up a container and using a folder on its root volume. Installing Samba was straightforward. I do eventually want to install WebDAV as well, but that will have to wait for another time (I've been sitting on this blog post for far too long!)

OctoPrint

OctoPrint is a server for managing a 3D printer remotely. I recently dusted off my old Ender 3 Pro and started making stuff with it again, and now that I've graduated and work a normal work schedule, I have fewer opportunities to check on a print throughout the day.

I decided to use the popular octoprint_deploy for installation. Since the README warns against trying to deploy in an LXC container, I made this host a full Ubuntu Server VM. I was able to pass through the USB device easily by its vendor and product ID, and everything worked on the first try! I don't have a USB camera for remotely monitoring yet, but I might decide to add one if I find this useful.

General purpose Ubuntu and Windows servers

I find myself occasionally booting into Windows 11 for things, most recently for uploading a codeplug to my OpenGD77 radio. Since I almost never use this partition otherwise (the video games I play all support Linux at this point), I really only need a basic Windows 11 box with USB passthrough support. I got the idea for this from my girlfriend Ari who uses a Windows VM for similar purposes. I also set up an Ubuntu desktop machine at the same time, mostly because it was easy to do so.

Results and Conclusions

First off, I would like to thank Ava for putting up with me occasionally disrupting our home internet to perform upgrades or maintenance. Thanks also to Ari, Hunter, and Tris for helping me troubleshoot on the countless occasions I accidentally shot myself in the foot with Proxmox. This project took me farther into the realm of networking and system administration than I had ever gone before, and I couldn't have done it without help.

Taken as a whole, this is probably one of my more practical projects as of late: I'm finally able to clean up my convoluted file sync situation, I've gained the ability to remotely manage stuff on my home network, and I've already enabled a few helpful automations with HomeAssistant.

]]>
https://breq.dev/projects/minirack /projects/minirack Tue, 29 Apr 2025 00:00:00 GMT
<![CDATA[orb.breq.dev]]>

Sitting right next to my desk as I type this is an IKEA FADO lamp fitted with an RGBW LED bulb. Among friends, it's become affectionately known as "the orb" due to its unique shape.

Implementation

This project uses quite a few moving parts to pipe lighting data through from the user's browser into a smart light bulb. I'll trace the flow through each of these components.

The user chooses a color through a basic web UI written in vanilla HTML, CSS, and JS. There's a sketch of the orb that I made in Inkscape to provide a preview.

The UI makes calls to a Flask-based backend, which turns around and calls HomeAssistant. This layer exists only to allow unauthenticated users to call these two specific methods on the HomeAssistant API (since the Flask app keeps a bearer token around). I deployed this on Dokku, which is a stack I've developed a strong familiarity with.

While I initially tried to make this project work with Google Home, I couldn't find an API that would allow me to invoke commands programmatically. Ava and I had been looking for an excuse to migrate to HomeAssistant for a while, and this was it! We found the setup process to be quite straightforward: most of our smart devices are Matter bulbs, which could easily be added to both Google Home and HomeAssistant.

HomeAssistant gives each entity an "Entity ID", which makes programming easy. The orb light is just light.orb! While finding the right documentation was a bit difficult, invoking actions on entities through the REST API was straightforward overall.
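
For the curious, the whole "set the orb to a color" operation boils down to a single REST call to HomeAssistant's light.turn_on service. Here's a minimal Python sketch of that call -- the token is a placeholder and the RGB value is just an example; the real logic lives inside the Flask app:

import requests

HA_URL = "https://homeassistant.home.breq.dev"  # our HomeAssistant instance (see below)
TOKEN = "..."  # long-lived access token, kept server-side and never sent to the browser

def set_orb_color(rgb: tuple[int, int, int]) -> None:
    """Ask HomeAssistant to turn on the orb with the given RGB color."""
    response = requests.post(
        f"{HA_URL}/api/services/light/turn_on",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"entity_id": "light.orb", "rgb_color": list(rgb)},
        timeout=10,
    )
    response.raise_for_status()

# Example: set_orb_color((255, 105, 180))  # hot pink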

We've deployed HomeAssistant to a Proxmox server running in our home mini-rack. Since our HomeAssistant instance is publicly routable at homeassistant.home.breq.dev, no special considerations were needed for piping together the Flask app deployed in the cloud with HomeAssistant running in our home network.

Results

In and of itself, the orb demo isn't that interesting, especially because you need to actually be in front of the light to see it working (and if you're already physically in my apartment, you can just use the Google Home to set the light). That said, it's a fun party trick, and I enjoyed finally being able to interact with my smart home devices easily from code! I'm definitely looking forward to doing more smart home projects in the future.

]]>
https://breq.dev/projects/orb /projects/orb Tue, 29 Apr 2025 00:00:00 GMT
<![CDATA[Greatest Hits of the MOS 6502]]>

Over the years, I've developed a strong appreciation for the MOS 6502, mostly through my work writing an emulator (something I feel everyone should do). In researching this chip, I've stumbled across countless creative tricks and workarounds both in the design of the chip and the vast library of software written for it.

Zero Page

This one's more of an interesting design choice: the 6502 has an absolutely tiny register file. The only general-purpose register is "A", the accumulator. There are two additional registers "X" and "Y", but they are only used for indexing memory.

However, the designers of the 6502 realized that this was extremely limiting and would force programs to do lots of reads and writes to memory just to handle temporary values. Using the "Absolute" addressing mode, each such access requires two bytes for the memory address and around 4 cycles depending on the instruction. The "Zero Page" addressing mode is a hack: a Zero Page instruction stores only the lower byte of the memory address (the upper byte is implied to be $00), and the processor can skip a cycle! This both reduces program size and speeds up execution. As a result, the first 256 bytes of memory effectively act as "registers" of sorts!
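
To make the difference concrete, here's a rough emulator-style sketch in Python (a simplified illustration, not code from any real emulator) of how LDA decodes in the two modes -- the zero page form fetches a single operand byte and implies a high address byte of $00, while the absolute form fetches two (status flag updates omitted):

from dataclasses import dataclass

@dataclass
class CPU:
    a: int = 0    # accumulator
    pc: int = 0   # program counter (here, pointing just past the opcode byte)

def lda_zero_page(cpu: CPU, memory: bytearray) -> None:
    """LDA $nn (opcode 0xA5): one operand byte, effective address in $0000-$00FF."""
    address = memory[cpu.pc]            # single-byte operand, high byte implied $00
    cpu.pc += 1
    cpu.a = memory[address]

def lda_absolute(cpu: CPU, memory: bytearray) -> None:
    """LDA $nnnn (opcode 0xAD): two operand bytes, low byte first (little-endian)."""
    low, high = memory[cpu.pc], memory[cpu.pc + 1]
    cpu.pc += 2
    cpu.a = memory[(high << 8) | low]

# Example encodings: A5 10    -> LDA $10   (2 bytes)
#                    AD 34 12 -> LDA $1234 (3 bytes)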

The "CPS pin"

Like many architectures, the 6502 has a status register with flags like "zero", "overflow", and "negative" that reflect the result of the last ALU operation. Unlike those architectures, though, the 6502 has a fun quirk: the chip has a physical "SO" ("set overflow") pin which, when the signal on it goes from high to low, sets the overflow flag in the status register. This flag can then be tested by subsequent branch instructions.

The usefulness of this pin is questionable, and it rarely saw usage in practice. It was apparently used in tight loops for routines interfacing with hardware, since a transition on the pin could cause code to jump out of the loop. The pin was originally called the CPS pin, short for "Chuck Peddle Special" (named after the main designer of the 6502, Chuck Peddle).

The BRK Instruction

On the 6502, opcode 00 maps to the "BRK" instruction. This is essentially used to trigger an interrupt from within the program.

In the 6502, the location in memory that code is executed from after a reset or interrupt is controlled by a table of "vectors" located at the very end of the memory map.

Signal    Location
NMI       FFFA-FFFB
RESET     FFFC-FFFD
IRQ/BRK   FFFE-FFFF

NMI and IRQ are for the two pins on the microprocessor of the same name ("non-maskable interrupt" and "interrupt request" respectively). RESET is for when the processor boots up or is reset. BRK shares its vector with IRQ, meaning the code routine which services maskable interrupts is the same routine which services BRK instructions. So, when you run BRK, your code immediately jumps to the interrupt routine. Once the interrupt routine returns with the RTI instruction ("Return From Interrupt"), your code keeps executing.

BRK was intended for use in debugging (as a "breakpoint" of sorts). On some systems, that's exactly what happened: the Apple II would drop into its built-in monitor, which would allow you to debug your program. However, this instruction has a weird quirk: despite BRK being a 1-byte instruction, when the interrupt routine returned, the program counter would jump forward by 2 bytes instead of 1 -- leaving the single byte after the BRK instruction unused. Clever programmers could exploit this by pulling the return address off the stack and reading the byte just before it, using it to pass state between the code calling BRK and the interrupt service routine.
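
In emulator terms, reading that hidden byte from inside the interrupt routine looks roughly like the following Python sketch (simplified; it assumes the handler was entered via BRK and ignores all other bookkeeping):

def brk_signature_byte(memory: bytearray, sp: int) -> int:
    """Inside a BRK handler: recover the byte that followed the BRK opcode."""
    # After BRK, the stack (page $01, growing downward) holds, from the top:
    # the pushed status register, then the low and high bytes of (BRK address + 2).
    return_lo = memory[0x0100 + sp + 2]
    return_hi = memory[0x0100 + sp + 3]
    return_address = (return_hi << 8) | return_lo
    return memory[return_address - 1]   # the byte immediately after the BRK opcode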

On the BBC Micro, BRK was used for error handling. To throw an error, the BRK instruction would be called with the error code stored in the byte immediately following it. Since returning from an error handler was not supported, an error string could additionally be stored immediately after the error code byte.

On the Apple III, Apple's Sophisticated Operating System used BRK for system calls. The system call opcode would be placed immediately after the BRK instruction, then a 2-byte pointer to additional parameters. This did mean the SOS routine responsible for system calls needed to increment the return address by 2.

Wozniak's SWEET16

The 6502 is an 8-bit processor, but addresses and other values are often 16 bits. While it's of course possible to manipulate these larger values using a combination of 8-bit operations, the resulting code is often quite verbose. Steve Wozniak, known for his obsession with going to great lengths to save a small amount of resources, did not want to deal with this, so he created an interpreted bytecode language called SWEET16. A program could call into a subroutine, and the code after the subroutine call would be interpreted as SWEET16 instructions until the SWEET16 "Return" instruction was executed, at which point the following code would once again be executed as native 6502 instructions.

SWEET16 is an interesting tradeoff between code size and performance. While it massively reduces the size of code, interpreted SWEET16 code runs at roughly one tenth the speed of native 6502 code. Optimizing for code size made sense back when memory was scarce and expensive; now that memory is plentiful, this tradeoff is less relevant than it once was.

SWEET16 supported several "nonregister operations". These included branch operations, which used a 1-byte signed offset as the branch target.

Opcode  Mnemonic  Function
00      RTN       Exit SWEET16, return to native 6502 code
01      BR ea     Branch always (i.e., jump)
02      BNC ea    Branch if No Carry
03      BC ea     Branch if Carry
04      BP ea     Branch if Plus
05      BM ea     Branch if Minus
06      BZ ea     Branch if Zero
07      BNZ ea    Branch if Non-Zero
08      BM1 ea    Branch if Minus 1
09      BNM1 ea   Branch if Not Minus 1
0A      BK        Break
0B      RS        Return from Subroutine
0C      BS ea     Branch to Subroutine

It also featured 16 "registers", stored in the zero page. These were specified using the second nibble of the opcode, allowing these instructions to take up only a single byte. Notably, some instructions supported an indirect addressing mode. Many operate on register R0, also called the accumulator.

Opcode  Mnemonic      Function
1n      SET Rn $xxxx  Set a register to an immediate value
2n      LD Rn         Transfer the value in the specified register to R0
3n      ST Rn         Transfer the value in R0 to the specified register
4n      LD @Rn        Load the low-order byte of R0 from the memory address stored in Rn, then increment the value in Rn
5n      ST @Rn        Store the low-order byte of R0 into the memory address stored in Rn, then increment the value in Rn
6n      LDD @Rn       Load a 16-bit little-endian value from the memory location stored in Rn into R0, then increment the value in Rn by 2
7n      STD @Rn       Store a 16-bit little-endian value from R0 into the memory location stored in Rn, then increment the value in Rn by 2
8n      POP @Rn       Decrement the value in Rn, then load the low-order byte of R0 from the memory address stored in Rn
9n      STP @Rn       Decrement the value in Rn, then store the low-order byte of R0 to the memory address stored in Rn
An      ADD Rn        Add the contents of Rn to R0, and store the result in R0
Bn      SUB Rn        Subtract the contents of Rn from R0, and store the result in R0
Cn      POPD @Rn      Decrement the value in Rn by 2, then load the 16-bit little-endian value from the memory address stored in Rn into R0
Dn      CPR Rn        Subtract the value in Rn from R0, and set the status flags for branching
En      INR Rn        Increment the contents of Rn
Fn      DCR Rn        Decrement the contents of Rn

SWEET16 stands out to me as a shockingly complete interpreted language implemented in a shockingly small amount of code. It's something I otherwise wouldn't have expected to be feasible in this context.
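
Part of what makes the interpreter so small is the encoding above: the high nibble of each opcode selects the operation, the low nibble selects the register, and a high nibble of zero falls through to the non-register table. A rough Python sketch of just that decode step (mnemonic spellings abbreviated, and purely illustrative):

def decode_sweet16(opcode: int) -> str:
    """Decode one SWEET16 opcode byte into a mnemonic (decode only, no execution)."""
    NONREG = ["RTN", "BR", "BNC", "BC", "BP", "BM", "BZ", "BNZ",
              "BM1", "BNM1", "BK", "RS", "BS", "?", "?", "?"]
    REG = ["?", "SET", "LD", "ST", "LD@", "ST@", "LDD@", "STD@",
           "POP@", "STP@", "ADD", "SUB", "POPD@", "CPR", "INR", "DCR"]
    op, n = opcode >> 4, opcode & 0x0F
    if op == 0:
        return NONREG[n]              # non-register ops use the whole byte
    return f"{REG[op]} R{n}"          # register ops encode R0-R15 in the low nibble

assert decode_sweet16(0x00) == "RTN"
assert decode_sweet16(0x15) == "SET R5"
assert decode_sweet16(0xA3) == "ADD R3"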

Illegal instructions and their uses

The 6502 has 151 valid opcodes. However, there are 256 possible values that the first byte of an instruction can take. What happens if you try one of the invalid ones?

  • 12 of them just lock up the CPU and prevent it from executing instructions further
  • 27 of them are no-ops, with varying instruction lengths
  • 1 has non-deterministic behavior ("XAA", which I won't even try to describe here)
  • The rest are perfectly usable, but do some rather strange operations...

Mnemonic  Description
ANC       AND memory with accumulator, then move negative flag to carry flag
ARR       AND memory with accumulator, then rotate right
ASR       AND memory with accumulator, then logical shift right
DCP       Decrement memory by 1, then compare with accumulator
ISC       Increment memory by 1, then subtract from accumulator, store result in accumulator
LAS       AND memory with stack pointer, store result in accumulator, X register, and stack pointer
LAX       Load both accumulator and X index register from memory
RLA       Rotate memory left, then AND with accumulator
RRA       Rotate memory right, then add to accumulator
SAX       Store accumulator AND X register into memory
SBX       Subtract memory from (accumulator AND X register), store in X
SHA       Store (accumulator AND index register AND a value dependent on the addressing mode) into memory
SHS       AND accumulator with X register, store in stack pointer, then store (stack pointer AND high byte of memory address) into memory
SHX       Store (index register X AND upper byte of address plus 1) into memory
SHY       Store (index register Y AND upper byte of address plus 1) into memory
SLO       Shift memory left, then store (memory OR accumulator) into accumulator
SRE       Shift memory right, then store (memory XOR accumulator) into accumulator

The technical reasons why this behavior happens are described well in this blog post and this document.

Some of these instructions were used in a handful of NES games (mostly the 2-byte NOPs), and they also saw use in various BBC Micro games. Games used them both for copy protection (presumably as an attempt to validate the hardware) and for performance. My guess is that more serious software doesn't bother with these since performance is less of a concern than correctness.

Later revisions of the 6502, such as the W65C02S, replaced the opcodes in the "F" column with additional bit manipulation opcodes.

The Ricoh 2A03/2A07 and the NES

The 6502 was incredibly popular, and inspired a few clones and derivatives. One such clone was the Ricoh 2A03/2A07 chip used in the Nintendo Entertainment System. Apparently, Nintendo was not a big believer in copyright law back in this era. Oh, how the times have changed...

To see how blatant this was, first look at this image of the 6502 chip (courtesy of Visual 6502)...

Now, look at this image of the Ricoh 2A03 (also from Visual 6502)

Does that little bit in the lower right corner look familiar?

Interestingly, the Binary Coded Decimal (BCD) functionality of the 6502 was fused off before the design was incorporated into the Ricoh clone. This feature allowed the 6502 to perform arithmetic on numbers stored as decimal, with 4 bits representing each decimal digit. It was covered under a patent held by MOS, which is probably why Ricoh disabled it.
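
As a quick illustration of what decimal mode does: with the flag set, ADC treats each byte as two packed decimal digits, so $19 + $28 yields $47 rather than the binary result $41. Here's a simplified Python sketch of that digit-wise adjustment (the real hardware also juggles the carry and other status flags as part of the same operation):

def bcd_add(a: int, b: int) -> tuple[int, int]:
    """Add two packed-BCD bytes, returning (result, carry out)."""
    lo = (a & 0x0F) + (b & 0x0F)
    carry = 1 if lo > 9 else 0
    lo = (lo + 6) & 0x0F if carry else lo          # skip over hex digits A-F
    hi = (a >> 4) + (b >> 4) + carry
    carry_out = 1 if hi > 9 else 0
    hi = (hi + 6) & 0x0F if carry_out else hi
    return (hi << 4) | lo, carry_out

assert bcd_add(0x19, 0x28) == (0x47, 0)   # 19 + 28 = 47
assert bcd_add(0x58, 0x46) == (0x04, 1)   # 58 + 46 = 104 (carry set)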

The other parts of this chip are for things like the NES's sound generator, known as the "APU" and definitely worthy of its own post someday. It's also why there are two variants of this chip: one for NTSC and one for PAL.

Conclusion

The 6502 is one of the last bastions of an age when a microprocessor could be designed by hand on a pile of parchment on a dude named Chuck's desk. It put real computing power within reach of ordinary people, and powered countless people's first experiences with computers. The number of programmers today who got their start on the 6502 alone makes the chip world-changing in and of itself.

]]>
https://breq.dev/2025/04/23/6502 /2025/04/23/6502 Wed, 23 Apr 2025 00:00:00 GMT
<![CDATA[A "5V Bypass Mod" for the PI040202-7X2C PCIe Card]]>

The article below was republished from the internal NURover Notion wiki. On the Rover team, we use a PCIe to USB card with four individual USB controllers to provide enough bandwidth for our camera streaming system (something I've posted about previously). As you can probably imagine, the USB system on the rover has very high current draw, and we've fried cards (quite dramatically) in the past.

Some PCIe to USB cards require a modification to pass through a high amount of current on the 5V rail. This page describes how to apply the modification.

Do I need to perform a 5V Bypass Mod?

There are two types of "quad chip" PCIe card available: “B” variant and “C” variant.

PI040202-7X2B (”B” variant)

PI040202-7X2C (”C” variant)

This modification is only required for the “C” variant. The “B” variant boards provide power from the +5V pin of the Molex or SATA power connector directly to the USB port (after some LC filtering). The “C” variant boards instead provide power from the +12V pin on the power connector, passed through a voltage regulator chip. This chip cannot handle the high current loads present on the rover.

Once this modification has been performed, the card can no longer be powered from the PCIe slot and will REQUIRE external power to operate.

Removing the Voltage Regulators

The first step is to remove the 12V to 5V voltage regulator chips. There are two ways to do this.

Hot Air

The least destructive method of removal is the use of a hot air workstation.

  • Set the hot air gun to a temperature of approximately 450 °F and turn the fan up.
  • Apply heat to the chip.
  • Once the chip is heated, use a small flathead screwdriver to push it out of position and away from its footprint.

Notes on this method:

  • Note that if you do not push the chip completely off of the footprint, it may reattach to a different set of pads. Just try again!
  • It is okay if the components surrounding the chip start to move a bit — the soldermask and the surface tension of the solder should keep them roughly in place. Even if they get knocked off entirely, these components only filter the input side of the voltage regulator and are no longer needed.

Cutting Tools

If the chip is burnt, you can use flush cutters to remove it.

  • Remove the plastic casing of the chip by scraping it off.
  • Using flush cutters, cut the legs of the chip off.

Notes on this method:

  • Make sure to avoid lifting pads! If the pads are lifted from the board too much, they can flop around and touch each other and cause shorts. Trim the pads as far back as possible.
  • This method gives you much less room between the pads of the chip and the side of the inductor you want to solder to, which makes soldering more difficult.

Attaching Jumpers

You want to bridge the +5V input from the Molex connector to the two +5V outputs that would have been coming from the regulator.

Your first instinct may be to solder to the 5V output pad where the regulator used to be. This is possible, but difficult to do by hand. A much easier approach is to solder to the input side of the inductor!

Drop a blob of solder on the side of the inductor. The picture shows slightly too much; I cleaned it up afterward. Don’t worry too much about making it pretty.

Cut a small length of wire (I used 24 gauge) and strip a bit at the end. Tin the end with more solder.

Finally, use one hand to push the wire into position and the other hand with the soldering iron to heat the pad. It may take a while for the inductor to get up to temperature — be patient!

The reason we were liberal with solder before is that it frees up one of your hands — you don’t need to hold the solder while attaching the wire.

Repeat the process for the second inductor, attaching a second wire.

Trim the wires to length — you want them to attach to the +5V pin on the Molex connector (furthest from the PCIe connector on the card). You can line them up perpendicular style like so to attach both easily.

Finally, add solder!

Testing for Shorts

There are two possible shorts that you need to check for. The GND and 12V pins heading to where the voltage regulator used to be are very close to the pad on the inductor that you soldered to, and bridging is very possible. You can test for this by checking continuity between your jumper wires and the USB connector shielding (for GND), and between your jumper wires and the output side of diodes D1/D2 and D3/D4 (for 12V).

If neither rail is shorted, your board is ready to use!

Schematic

]]>
https://breq.dev/2025/04/09/usb-pcie-5v-bypass /2025/04/09/usb-pcie-5v-bypass Wed, 09 Apr 2025 00:00:00 GMT
<![CDATA[KiCAD SVG Prettifier]]>

This project recolors SVG exports of KiCAD circuit board designs to appear in realistic colors.

Motivation

I faced an annoying problem with KiCAD the other day. I wanted to export a realistic-looking image of a PCB design to use for an upcoming project. However, the existing options for doing this didn't fit my needs.

The 3D viewer, while very cool, isn't great for this use case. I would prefer a vector image to a raster one. Plus, with the way the lighting is set up, an "exactly head-on view" of a board can look very blown out. Here's a board with light-colored silkscreen as viewed head-on in the 3D viewer:

KiCAD also has the ability to export an SVG natively, but the result looks like what you would see in the editor: the colors match the editing layers, not the actual realistic colors of the board, and the soldermask layer is inverted. Plus, I wanted a way to generate separate front and back views, with the back view mirrored to how it would appear in real life.

Implementation

So, I built a tool to recolor the output from the KiCAD format to look more realistic. Annoyingly, KiCAD-generated SVGs do not put each layer into a separate SVG group or layer, so my code queries for nodes based on their existing color to apply the new colors. This would break if KiCAD exports things in an unexpected color, but since these colors are seemingly not configurable by the user, this should be fine.

I chose to write this app in JavaScript since the language already provides good utilities for parsing, querying, and manipulating XML documents like SVG files. The UI is pretty utilitarian, but hey, it works and loads quickly! I wrote this without using a framework for styling or JavaScript -- it was nice to take a break from the complex tooling of my usual React/Tailwind setup and go back to basics for a bit.

The whole app is based around some gnarly CSS selectors, especially since the format changes substantially between KiCAD versions and I was trying to target versions 7 through 9.
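
For illustration, the core recolor-by-color idea can be sketched in a few lines of Python -- to be clear, the actual tool is JavaScript and uses CSS selectors, and the colors and file names below are made-up placeholders rather than KiCAD's real palette:

import xml.etree.ElementTree as ET

# Hypothetical mapping from KiCAD's export colors to "realistic" board colors.
COLOR_MAP = {
    "#c83434": "#1a5d1a",  # e.g. a copper layer color -> soldermask green
    "#f2eda9": "#ffffff",  # e.g. a silkscreen color -> white silkscreen
}

def recolor(svg_in: str, svg_out: str) -> None:
    """Rewrite every node drawn in a known export color with its realistic color."""
    tree = ET.parse(svg_in)
    for node in tree.iter():
        fill = node.get("fill")
        if fill in COLOR_MAP:
            node.set("fill", COLOR_MAP[fill])
        style = node.get("style")
        if style:
            for old, new in COLOR_MAP.items():
                style = style.replace(old, new)  # colors may also appear in style attributes
            node.set("style", style)
    tree.write(svg_out)

# Example: recolor("MyBoard.svg", "MyBoard-pretty.svg")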

Here are some example outputs for a board I'm working on:


Try it yourself

If you want to try it out yourself, you can! First, in KiCAD, export an SVG with the relevant layers. In older versions, this is straightforward:

However, in the new version, they've replaced the "Export SVG" dialog with the "Plot" window, which exports a separate SVG file for each layer (which is not what we want!) The easiest way to get a single SVG image is with the KiCAD CLI (replace MyBoard.kicad_pcb with the filename of your board file):

kicad-cli pcb export svg --mode-single --layers F.Cu,B.Cu,F.Silkscreen,B.Silkscreen,F.Mask,B.Mask,Edge.Cuts MyBoard.kicad_pcb

Then, go to kicad-pretty.breq.dev and:

  1. Upload your SVG
  2. Change the colors as desired (e.g. if you use a different soldermask color)
  3. Click "Make Pretty!"
  4. View each preview with the "Show Front" and "Show Back" buttons
  5. Export each side with the "Export" button

If you're having issues with the export format, feel free to reach out to me and I'll do my best to investigate! I've only been able to try a handful of SVG viewers and KiCAD boards so far.

]]>
https://breq.dev/projects/kicad-pretty /projects/kicad-pretty Thu, 03 Apr 2025 00:00:00 GMT
<![CDATA[Who's Using breq.dev?]]>

My friend Jules made a fun suggestion on Bluesky recently:

Let's step through the search results for my domain name on GitHub and see what we can find!

eightyeightthirty.one

In 2023, some friends and I built a scraper tool called eightyeightthirty.one which attempted to map the entire graph of 88x31 buttons. Soon after, I wrote an article applying network science to the 88x31 graph.

A few people credited my project page for the 88x31 scraper tool. Please credit NotNite's portfolio instead, she did most of the work!

5F3759DF

A few people have linked to my 2021 article on the fast inverse square root algorithm in code over the years:

Flowspace

At least one person other than myself has set up a profile on Flowspace, a social network demo I made in 2021.

Accelcoin

In 2022 I was a teaching assistant for Fundamentals of Computer Science 1 at Northeastern. One of our course assignments was for students to build a working blockchain node implementation. I ran some infrastructure relating to this! Some students have published their code to GitHub.

Wordle

In 2022 I published a few implementations of the Wordle algorithm in TypeScript and Rust.

]]>
https://breq.dev/2025/03/10/whos-using-breqdev /2025/03/10/whos-using-breqdev Mon, 10 Mar 2025 00:00:00 GMT
<![CDATA[Reflections on Four Years of Rover]]>

As many of you may know, I've been involved with the Northeastern University Mars Rover Team for four years now. We compete each year in the University Rover Challenge and occasionally the Canadian International Rover Challenge. I personally have competed in four URC events and one CIRC event, and will (hopefully!) be competing at the upcoming URC and CIRC events this summer.

Here's our latest System Acceptance Review video, which gives some context for the team structure and the subsystems we develop.

Rover has largely dominated my college experience. Late nights in the lab have been a staple of all four years of my college life. During my co-op searches, rover has consistently been the only thing any interviewer has ever wanted to talk to me about. While the club has occasionally gotten overwhelming, I have no regrets about pouring as much energy and effort into the team as I have.

This post is a collection of things I learned from my time on the team and reasons why experiences like this are valuable. This section is written from my perspective as a software developer, but the general principles apply across the engineering disciplines.

Thinking across disciplines

It's critical to work with engineers outside the software field, for two reasons: it helps you teach others about software, and it helps you learn about other forms of engineering.

As an early career software developer, you might never work with anyone outside the software field unless you seek those situations out. Even if classes require team assignments, they are typically done exclusively by students majoring in computer science. Furthermore, many internships on large software teams don't involve any communication with those outside the software team. During my 6-month co-op at Amazon Robotics, I can't point to a single issue in which I had to work deeply with someone with expertise outside of embedded software in order to solve a technical problem. That's a shame, since communicating across disciplines is a critical skill for solving problems in the real world.

The other advantage of an experience like the rover team is that it allows you to develop an understanding of fields outside your core strengths. While on rover, I did electrical work, replaced and installed mechanical parts, and even got involved in our filming team for our System Acceptance Review video. My greatest regret on the team is not learning SolidWorks. The base level understanding of electrical and mechanical engineering I developed has come in handy countless times over the years.

Hands-on experience

My university is (somewhat) unique in that it encourages students to participate in 6-month "co-ops" instead of 4-month internships. They pride themselves on this "experiential learning" approach, and one could assume that it diminishes the importance of developing hands-on experience through clubs. However, I would argue the opposite: co-op students benefit from hands-on club experience at least as much as those on a traditional internship schedule do.

One reason is that club experiences give members a much broader range of fields and topics to study. Co-op projects are often small, self-contained, and confined to a small field and area of the product. In clubs like the rover team, you can choose as many or as few topics as you like to learn about and dive deep into. Clubs like the rover team have countless subsystems and components, both hardware and software, which need to be integrated; with members leaving every four years, there are plenty of opportunities to become the team expert in a particular system.

During my time on the rover team, I became the "expert" on our command and control interface (since rewriting it happened to be my first project), our radio system (since I was in the right place at the right time to troubleshoot it at my first competition), our camera streaming system (since I arranged to take over development on it after the previous experts graduated), and our migration to ROS 2 (since I was in a leadership position at the time we migrated). Unless you work at an early-stage startup, it's difficult to develop this breadth of expertise in a corporate environment.

Similarly, engineering clubs allow members to be part of high-level design decisions and planning steps that are often opaque to interns and co-op students. The design of large-scale software systems is a critical skill often not taught well in classes. You can study design patterns all day long, but putting that knowledge into practice is a skill that takes, well, practice.

Another advantage of club experiences is that they teach you software development practices before you go on an internship or co-op. I am always amazed at the number of extremely smart Northeastern students who have no experience with Git, pull request workflows, how to leave a code review, or other critical steps in the software development process until their first co-op. While Northeastern does have a class which covers these, it is taught too late and in too little detail. You could argue that teaching these skills is the purpose of a co-op, but it often doesn't work out that way in practice: small companies may employ poor software development practices, large companies like Google or Amazon use their own bespoke workflows unlike anything else in the industry, and students don't always get the chance to leave code reviews themselves.

Leadership and collaboration

Working on a 4 person team is quite different from working on a 40 person team, and a 40 person team is quite different from a 400 person team. My rover team had around 40 active members at a given time. I think this number is about perfect for a few reasons:

  • It's a large enough group of people that a hierarchical team structure is necessary to be productive. Members need to be split into subteams, subteams need leads, and subteam leads need a team lead. No one person can hold all of the expertise about the system in their head at any given time.
  • It's a small enough group of people that becoming the expert in a subsystem or rising into a leadership position is very possible for all incoming members.

I'm sure I'm far from the first person to point out that personal projects are no substitute for building things with others, but it's true! Going directly from personal work to working 40 hours a week on a team is a jarring transition for many. Although software courses try to remedy this with teamwork assignments, there is no substitute for the experience gained working with a large team over multiple years.

It's fun!

My final reason is that rover development work and competition are fun as hell. If the only software development work I did was for classes or co-op, I worry I would quickly lose interest in the field. Personal projects can be fun to develop as well, but nothing compares to the shared sense of camaraderie, triumph, and accomplishment I experienced on the rover team.

]]>
https://breq.dev/2025/03/06/rover /2025/03/06/rover Thu, 06 Mar 2025 00:00:00 GMT
<![CDATA[Wall Matrix 2]]>

This project is a wall-mounted LED matrix built to show album art, nearby transit and bikeshare information, and weather data.

Motivation

Almost four years ago, I made the first iteration of a wall-mounted LED matrix project to show weather and transit info, and it's one of the projects I've gotten by far the most use out of. It's been hung at my childhood home in Maine, in various dorm rooms over my college years, and now at the apartment I share with Ava.

Over that time, we've made only a few changes:

  • The display now automatically cycles between weather and transit info (since I like to see both before commuting).
  • Weather info now displays in both Fahrenheit and Celsius (since Ava prefers the latter).
  • Multiple upcoming trains are shown instead of just one.

However, as we've added each of these features, we've needed to pack more information into a very small (only 16x32!) screen. This tips the balance away from legibility (mostly fine for us, but less useful for guests at our house).

Around the same time, the two of us got a record player shortly after moving in and started building a record collection, which has reinvigorated our appreciation for album art. I thought the idea of a piece of wall art to display album covers sounded cool, so I started looking at getting a variant of the LED matrix with a square aspect ratio.

Technical Description

"Hardware Rev 2" is built around a 64x64 matrix panel with half the pixel pitch of the original, leading to a device with the same width and double the height. I decided to build it with a Raspberry Pi Zero 2W and a Matrix Bonnet from Adafruit with matching dimensions to shrink the electronics down from the original and let it sit closer to the wall.

This also meant that I could build this new revision without taking parts from the old build! I gave the old model to my girlfriend Mia.

Software

I also chose to start from scratch with the software of this version. I've learned a lot about software development since writing the first iteration of the code, and the old design included way too many layers of abstraction. (Features like the preview display and web server were cool, but saw little day-to-day use.)

The display supports four main screens with information pulled from Spotify, weather, MBTA, and BlueBikes.

The Spotify screen just displays the artwork from whichever album Ava or I am listening to at a given moment. While I was initially happy to see Spotify's CDN making a perfectly-sized 64x64 version of the image available, it turned out to be too affected by JPEG artifacts to be usable here. So, I settled for downscaling a higher-resolution version.

Spotify data is pulled using the spotipy library.
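In rough terms, the fetch-and-downscale step looks like the sketch below. (This is illustrative, not the sign's actual code -- spotipy reads its credentials from the environment, and the Pillow resize is just one reasonable way to do the downscaling.)

import io

import requests
import spotipy
from PIL import Image
from spotipy.oauth2 import SpotifyOAuth

# spotipy picks up SPOTIPY_CLIENT_ID / _SECRET / _REDIRECT_URI from the environment.
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="user-read-currently-playing"))

playback = sp.currently_playing()
if playback and playback.get("item"):
    # Grab the largest album image and downscale it ourselves, rather than
    # using Spotify's artifact-heavy pre-sized 64x64 rendition.
    art_url = playback["item"]["album"]["images"][0]["url"]
    art = Image.open(io.BytesIO(requests.get(art_url, timeout=10).content))
    art = art.convert("RGB").resize((64, 64), Image.LANCZOS)
    # `art` is now ready to be pushed to the 64x64 framebuffer.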

The weather screen is pretty similar to the weather screen on the old version, showing an icon from Dhole's awesome weather pixel icons alongside the current time, temperature, and high/low temp for the day. (Thankfully, we now have labels corresponding to the F and C temperatures.) Data is sourced from OpenWeatherMap, using a lookup table I made to match their weather condition codes to weather pixel icons.
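The lookup table boils down to OpenWeatherMap's condition code groups. A minimal sketch (the icon names here are placeholders, not Dhole's actual filenames):

# OWM condition codes group by their leading digit:
# 2xx thunderstorm, 3xx drizzle, 5xx rain, 6xx snow,
# 7xx atmosphere (mist, fog, ...), 800 clear, 801-804 clouds.
GROUP_ICONS = {2: "thunderstorm", 3: "drizzle", 5: "rain", 6: "snow", 7: "fog"}

def icon_for(code: int) -> str:
    if code == 800:
        return "clear"
    if 801 <= code <= 804:
        return "clouds"
    return GROUP_ICONS.get(code // 100, "unknown")

assert icon_for(501) == "rain"    # moderate rain
assert icon_for(803) == "clouds"  # broken clouds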

The MBTA screen is designed to mimic a station countdown clock, but with data pulled from several bus and train lines near my apartment. It merges schedules and realtime predictions from the MBTA Realtime API. I've written some custom logic to try to deduplicate schedule and realtime data based on trip identifiers, but it doesn't always work perfectly.
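The merge itself is conceptually simple: key everything on the trip ID and let realtime predictions override the static schedule. A simplified sketch against the MBTA V3 API (the filtering and formatting here are assumptions, and real data has plenty of edge cases this ignores):

import requests

API = "https://api-v3.mbta.com"

def upcoming(stop_id: str, route_id: str) -> list[dict]:
    """Merge scheduled and predicted times for a stop, keyed on trip ID."""
    params = {"filter[stop]": stop_id, "filter[route]": route_id}
    by_trip: dict[str, dict] = {}

    # Process predictions last so realtime data overrides the static schedule.
    for endpoint in ("schedules", "predictions"):
        resp = requests.get(f"{API}/{endpoint}", params=params, timeout=10)
        resp.raise_for_status()
        for item in resp.json()["data"]:
            trip_id = item["relationships"]["trip"]["data"]["id"]
            attrs = item["attributes"]
            time = attrs.get("departure_time") or attrs.get("arrival_time")
            if time:
                by_trip[trip_id] = {"trip": trip_id, "time": time, "source": endpoint}

    return sorted(by_trip.values(), key=lambda entry: entry["time"])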

The BlueBikes screen is probably the one I'm most proud of. I hand-drew each of the icons for bikes, e-bikes, and docks that show for each station.

Bikeshare services publish realtime data using the General Bikeshare Feed Specification (GBFS), analogous to the GTFS format used for transit data. The BlueBikes API has excellent documentation. It's a bit inefficient (I haven't found a way to filter to just a subset of stations), but it works great for this use case.
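Since there's no server-side filtering, the screen just pulls the whole station_status feed and picks out the nearby docks. A rough sketch (the feed URL comes from BlueBikes' GBFS index, and the station IDs below are placeholders):

import requests

STATUS_URL = "https://gbfs.bluebikes.com/gbfs/en/station_status.json"
MY_STATIONS = {"A32010", "A32011"}  # placeholder IDs for the docks near us

def station_counts() -> list[dict]:
    stations = requests.get(STATUS_URL, timeout=10).json()["data"]["stations"]
    return [
        {
            "bikes": s["num_bikes_available"],
            # num_ebikes_available is a common extension field; default to 0 if absent
            "ebikes": s.get("num_ebikes_available", 0),
            "docks": s["num_docks_available"],
        }
        for s in stations
        if s["station_id"] in MY_STATIONS
    ]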

My friends and I have fun with a website called makea.fish, which generates an image of a fish at 11:11. (We've worked on various other fish generators, too!)

As a fun easter egg, the LED matrix will switch at 11:11 to showing a fish image generated by makea.fish which refreshes every 10 seconds.

Hardware

A view of the back of the device, showing the 3M command strips used to hang it on the wall

Just like the original, this LED matrix sign is contained within a 3D-printed enclosure. It uses a Raspberry Pi Zero 2W (my first project with a Pi Zero!) with Adafruit's Matrix Bonnet, using a DC jack "extender" I soldered to route the power input to the bottom of the device.

The case features a "chin", since the stackup height of the electronics is annoyingly tall. I also added ventilation holes, since there was a bit of discoloration on the wall from the old design. This was my first 3D printing project in a while (outside of my day job working with volumetric printing).

Quite a long time ago (was it middle school??), I bought a Monoprice Select Mini 3D printer. It held up surprisingly well over the years, including through two hotend replacements. This is the printer that the old LED sign case was printed with! It had a print volume of only 120x120x120mm, so I had to break up the piece into two separate parts.

A few years ago, I bought an Ender 3 Pro during a Micro Center sale. (Annoyingly, I was living away from Boston over the summer, so I ended up grabbing it and lugging it around for a full day of sightseeing with my parents -- in my defense, the sale was about to expire!) I set it up and ran a few prints, but could seemingly never get it to work as well as my old Monoprice one. While I used this for the rave choker project, it just didn't work well for printing larger parts. It spent a few years collecting dust at my parents' house.

Then, earlier this year, I decided to take the printer back down to my apartment and try to get it up and running again. I ended up doing almost a full rebuild, and found that the two screws holding the Z-axis stepper in place had come loose (maybe they vibrated out, maybe they were never tightened properly in the first place). After tightening those, the parts for this project came out beautifully!

The Ender 3 is large enough to fit the full size of the LED matrix, but isn't quite big enough to print the entire case in one piece. Thus, I broke the part up into the frame around the LED matrix and the "chin" below which contains the Raspberry Pi. The two parts are attached using heat-set inserts.

And as for why the parts are pink instead of black: after years of being stored badly, the pink spool was one of the only ones I had which wasn't made too brittle by the humidity. (Plus, the pink and black scheme is growing on me...)

Conclusion

I'm really happy with this! It fits nicely into our living space and does its job of both looking unique and providing useful info. I have thought of a few improvements to make, which will maybe happen at some point. There's a lot of empty space in the enclosure right now!

  • Automatic brightness adjustment: The sign easily lights up a dark room. I keep it out in the living room instead of the bedroom, so it's not a big deal, but it'd still be a nice touch.
  • On-device controls: It would be nice to be able to configure the device, change modes, turn on/off, etc. with some interface on the device itself. While the display could be reused for an on-screen menu system, we'd need some form of button for user input. It would definitely need to be minimal (I don't want this to turn into a wall-mounted gameboy). I think an encoder with pushbutton might be a nice touch, since the only thing visible to the outside would be a single knob.
  • Cloud interaction: The old matrix had a website where anyone could enter a message and it would scroll across the screen. I haven't added anything like this to the new code (yet), but it'd be pretty fun! (Maybe it could even show drawings using the increased resolution...)

It also feels quite polished compared to a lot of the quick builds I've done, like I could do a small run of these. The parts costs are a bit high (mostly the LED matrix itself), so that'll probably never happen, but I can dream!

]]>
https://breq.dev/projects/matrix2 /projects/matrix2 Tue, 28 Jan 2025 00:00:00 GMT
<![CDATA[Why Amateur Radio?]]>

The amateur radio hobby is generally dominated by a particular demographic. If you look around at a typical large hamfest, spend any time listening in on a net, or ask anyone for their stories of family members, you'll easily spot it. As a young queer student, my friends and family members are often surprised to hear that I'm involved in the community. So why do I gravitate to it?

Projects

As my software knowledge has matured, many of my recent projects have been more focused on building things that I would find useful in my day-to-day life. Amateur radio is a great opportunity for projects like those. Radio hardware is often very open to hacking and experimentation -- open-source custom firmware and custom programming tools are extremely common.

Personally, I enjoy connecting the hobby to my interest in software through projects like rolodex, a "contacts app" for storing callsigns and repeater information, or my work with the Northeastern Mars Rover Team involving autonomous mission control over AX.25.

Adventure and Practicality

When amateur enthusiasts talk about "off-grid communication," it's often in reference to an emergency scenario in which cellular networks fail. While amateur radio does often prove useful in these scenarios, it's infrequent enough that I don't find it particularly exciting.

Another situation where off-grid communication is desirable is in those few areas which are still unreachable by cell signal. I've personally experienced the usefulness of this at the University Rover Challenge, where the site has little to no cell coverage and the range of cheap FRS radios is often inadequate. I've also tuned my handheld radio to weather frequencies to get forecast information in remote areas of Utah and New Mexico.

Even in situations where communicating over cell phones is possible, radio can still pose advantages. It's much easier to communicate via radio while driving a car, while outside with gloves on, or in other situations where reaching for a phone is inconvenient.

Train autism

One of my favorite things to use my handheld radio for is scanning: listening to signals on railroad channels, marine channels, and other government or commercial bands. Listening to how radio is used in the world around me illuminates systems and processes like train dispatch, ship-to-ship communication, and more that are otherwise hidden from view. Discovering these frequencies is an interesting and rewarding challenge in and of itself, and seeing these processes firsthand shows so much about the infrastructure that makes our world run.

Friends

Amateur radio is a hobby more or less predicated on talking to people, so it's critical to find a group of people you enjoy talking to. Radio communication takes a few forms:

  • planned contacts between two or more stations (e.g. me talking to my friends)
  • nets, often hosted on a repeater (for VHF/UHF) or specific frequency (for HF) at a specific time
  • contesting, usually on HF, in which stations try to contact as many other stations as possible within a specific timespan (usually a day)
  • DXing (where DX is slang for "distance"), also usually on HF, which involves making contacts with distant radio stations and can involve digital protocols such as FT8

HF equipment is more expensive and generally incompatible with apartment life, which means I'm stuck with a handheld for personal use. That said, I've done some contesting with Northeastern University Wireless Club and my friend Philo is an avid FT8 enjoyer -- it's possible!

Regardless, this leaves me with nets as one of my only options. While I've tuned into a few on repeaters around Boston, I generally am not interested in the conversation that goes on there. (I can only take so much listening to boomers talk about their medical problems.)

Despite how it may seem at first glance, there are a lot of young people in the hobby! Younger radio operators tend to form much more insular groups with their friends or at their university or hackerspace. A good place to spot the younger crowd is often on university-run repeaters -- around Boston, good options include W1KBN or W1XM.

If not for my close friends in the hobby, I would likely have nowhere near the level of involvement that I do now. We often use radios as an interesting and unique way to coordinate adventures together. My friend Ari created her own bandplan of simplex frequencies that we often use to communicate.

Radio can be a daunting hobby to get into due to its social nature. Finding your people is critical to having a good experience -- I had a handheld collecting dust on my desk for about a year before I really started using it. The first step in the hobby for many is becoming licensed, which often means working up the courage to show up to an unfamiliar place and take an exam administered by an unfamiliar group of people. Part of the reason I recently became a Volunteer Examiner is to help reduce this barrier to entry among friends, and to give me a better answer when someone I know asks how to get involved.

As far as hobbies go, amateur radio has a lot to offer, and not fitting the stereotypical demographic shouldn't hold you back from carving your own niche in the world of radio communications.

]]>
https://breq.dev/2025/01/05/amateur-radio /2025/01/05/amateur-radio Sun, 05 Jan 2025 00:00:00 GMT
<![CDATA[My Home Network Lab Notebook]]>

Some of you might know that Ava and I recently moved to our new home in Somerville! As part of this, I finally got the space to start having fun in building a home network setup.

While the setup is pretty tame for now (and Ava has specifically banned me from purchasing any rackmount gear -- alas...), I hope to keep this post updated with notes I have as I work to build more.

Core Equipment

Modem

We get internet through a cable provider, so the network starts with a modem. I picked a basic Arris SURFboard with DOCSIS 3.1 support since I wanted the flexibility afforded by running a separate router.

This modem operates a little strangely: it gives itself 192.168.100.1, hands out 192.168.100.10 to the first device it sees on DHCP, then also forwards along the assigned public IP address to the connected device. While waiting for the pfSense router to arrive, we actually had to use a GL.iNet travel router, since neither the modem nor our access point has any routing functionality.

Router and Switching

My router is a small Netgate box running pfSense that was gifted to me by my friend Ari. It has two LAN ports: one goes to the network switches, and the other goes to the Wi-Fi access point. This lets the router and access point use VLANs to separate out clients on each SSID.

I keep the travel router I was using in the interim around for any networking shenanigans that might require it.

I have two Netgear switches I picked up at MIT Swapfest a while ago: one sits near the rest of the equipment on my shelf, and the other sits on my desk to connect to my desktop and oscilloscope. One of these switches was featured in my friend Hunter's Devices of All Time blog post, which I somehow didn't realize until months later :)

Wi-Fi

After dealing with tons of Wi-Fi dropouts in our old apartment, I decided to invest in a Ubiquiti U6 Mesh access point. It's definitely overkill, but we get exceptional connection to anywhere in the apartment.

I was very excited to finally have a piece of Wi-Fi gear that didn't look like an over-the-top gaming device. The U6 Mesh is very compact and fits nicely on a shelf -- it's about the size of a tall seltzer can. The PoE power is actually quite useful in a home environment -- the cable run is a lot tidier with only a single cable to the device. The only criticism I have so far is that it gets pretty warm when it's running.

While it's theoretically possible to set up the device without a UniFi Controller, I couldn't figure out how to do so (the app kept crashing when I tried to go through the flow). So, I just installed the UniFi server software onto my desktop and ran it so I could configure the device, then closed out of it. The software allows creating any number of separate SSIDs, which we use to run our primary network, an SSID matching my parents' home in Maine, and an Eduroam hotspot.

Layer 3

Our IPv4 subnet runs on 10.0.0.0/24, mostly since it makes it possible to use shorthands like 10.2 instead of 10.0.0.2.

Our ISP also provides us with IPv6 -- yay!

I use pfSense's Dynamic DNS tools to point home.breq.dev at our IPv4 address. I couldn't get this to work for IPv6 (the router would use its own address, whereas I wanted to direct traffic to another device on the network). So I put our IPv6 prefix directly into the DNS dialog on Cloudflare and hoped for the best.

IPv6 Addressing

Getting the "suffix" of my IPv6 address to remain static ended up being a hassle -- I wanted my computer to continue to respect the prefix it was given, but to keep the same suffix part instead of randomly generating it.

I found ip token to solve this problem.

To try things out, I ran:

# Accept Router Advertisements to configure the network prefix
sudo sysctl -w net.ipv6.conf.enp37s0.accept_ra=1
# Set an IP token identifier (::bc for my initials)
sudo ip token set ::bc dev enp37s0

Then, to make it persistent:

# add net.ipv6.conf.all.accept_ra=1 in the ipv6 section
sudo nano /etc/sysctl.conf
# NetworkManager equivalent of the ip command
# My wired network is called "Ethernet",
# or by default it's something like "Wired Connection 1"
sudo nmcli connection modify Ethernet ipv6.addr-gen-mode eui64
sudo nmcli connection modify Ethernet ipv6.token ::bc

Split DNS

For IPv4, since we do NAT, I needed to use Split DNS to make sure services were available both on and off the network. (And since the DNS resolver built into pfSense was synthesizing records for these addresses, I also needed to have it synthesize the corresponding AAAA records.)

Services

We've got a few services hosted here, with more potentially to come:

  • home.breq.dev, largely just a landing page for now hosted with Caddy from my desktop
  • a Nextcloud instance with our home files
  • the LED matrix, running through a Cloudflare Tunnel for now

In the future, I'd love to also spin up

  • a "home node" for syncing my files with Syncthing
  • a permanent place for the UniFi controller software other than my Ubuntu desktop
  • a VPN host, for remote maintenance, getting around firewalls, and maybe hooking up my friends' networks and mine to make a mega home network

Eventually, I'd love to have a dedicated server for lots of these things (Nextcloud, Syncthing, etc) and maybe a reverse proxy to point to other devices on the network? It's not a huge priority, but maybe after I upgrade my desktop machine I'll be able to scrounge together some hardware.

]]>
https://breq.dev/2024/10/22/home-network /2024/10/22/home-network Tue, 22 Oct 2024 00:00:00 GMT
<![CDATA[Make a FiSSH]]>

fissh.breq.dev ("make a fissh") is a terminal-based fish generator that you access via SSH terminal. At 11:11, try running:

ssh fissh.breq.dev

and you'll see a fish appear in your terminal!

Fish generators

My friend Ari introduced me to the original makea.fish site (HTTP only), made by weepingwitch. The site embeds an image rendered with PHP showing either a randomly-generated fish or the phrase "come back at 11:11." Getting the 11:11 fish has become somewhat of a ritual among my friends (we have a Discord channel specifically for posting fish screenshots!)

Over the last few months, we've made a few fun variations on this:

"Make a fissh" was written by myself and my girlfriend Ava based on an idea from my girlfriend Mia. It's our contribution to the weird and wonderful world of fish generators.

Wish, bubble tea, and lip gloss

I got the idea for an SSH-based application from terminal.shop (an SSH-based coffee beans store). My first idea was to create a user whose login "shell" is just the fish application, similar to how git-shell only allows running Git commands. However, I couldn't find a way to make OpenSSH require neither a password nor a public key. I suppose there's a good reason for that, but it meant I needed to look elsewhere.

Looking into how terminal.shop implemented their solution, I stumbled upon Wish, a Go library designed for "SSH apps." I decided to build this within the Charm ecosystem, using Wish, Bubble Tea for layout, and Lip Gloss for "styling" (i.e., ANSI escape codes).

This stack imposed some requirements on the project structure: this was probably my first model-view-controller app since taking Fundamentals of CS 2, and Go definitely wouldn't have been my first pick of language. The MVC architecture works fine for something like this (although I would've preferred something a bit more component-based / React-y). I found the docs for the Charm projects pretty lackluster, and was often faced with undocumented behavior or limitations that I couldn't find any reference to online. (I probably should've realized that GitHub stars don't necessarily correlate with the amount of real-world use a library gets...)

I won't pretend to have insightful commentary on Go after using it for a tiny toy project, but I honestly struggle to see myself reaching for it in any situation. My first impression is that it's the complexity of Rust without any of the error handling, memory safety, or performance benefits, with more than its fair share of strange syntax and stdlib quirks, coupled with tooling that makes odd decisions at times.

That said, for a toy project, I got a huge amount of learning out of this, so I can't really complain :)

A fishy dataset

With the stack out of the way, it's time to actually build this thing. The most important part of a fish generator is the fish it generates. I decided to go the easy route and source fish ASCII art from ascii.co.uk. One thing I like is that most artists leave their initials somewhere in the drawing. I wish I could link to some of the ASCII artists directly from the fish page, but unfortunately, the site doesn't give any other attribution.

Finishing touches

Getting the user's timezone right is another integral part of fish generation. Web-based generators get this easily from the JavaScript Date API, but for other protocols, it can take some creativity. The aforementioned SSTV fish telephone line, for instance, just accepts any time ending in :11.

With a terminal, though, we get a bit more information: the user's IP address. The app uses ipinfo.io as a geolocation database since their free tier seemed generous and they helpfully provide a timezone key in their output. As such, the server can ensure each user gets the fish in their corresponding timezone (assuming the GeoIP database is accurate).
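The actual app is written in Go, but the idea is simple enough to sketch in a few lines of Python (the JSON field matches ipinfo.io's documented output; error handling is omitted):

from datetime import datetime
from zoneinfo import ZoneInfo

import requests

def is_fish_time(client_ip: str) -> bool:
    # Return True if it's 11:11 wherever ipinfo.io thinks this IP is located.
    info = requests.get(f"https://ipinfo.io/{client_ip}/json", timeout=5).json()
    tz = info.get("timezone", "UTC")  # e.g. "America/New_York"
    now = datetime.now(ZoneInfo(tz))
    return (now.hour % 12, now.minute) == (11, 11)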

No app would be complete without a short "about" page, let alone one with so many friends to credit. I took advantage of OSC 8 escape sequences to embed links to the fish dataset and various folks' websites. These work in many popular terminal emulators, but not all, and there's no way to query whether a terminal supports OSC 8, so I can't fall back to printing full URLs when it doesn't.
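For the curious, an OSC 8 hyperlink is just the link text wrapped in a pair of escape sequences. A quick Python illustration:

def osc8(url: str, text: str) -> str:
    # ESC ] 8 ; ; URL ST  <text>  ESC ] 8 ; ; ST   (ST is ESC \)
    return f"\x1b]8;;{url}\x1b\\{text}\x1b]8;;\x1b\\"

print("Fish art courtesy of " + osc8("https://ascii.co.uk", "ascii.co.uk"))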

Deployment

Deploying something that runs on port 22 on a server that you presumably also need to SSH into is an annoying situation, and I've had enough bad experiences messing up an SSH config and locking myself out of a VPS that I decided to just spin up a new one for this. (Oracle Cloud Free Tier my beloved!) That said, switching OpenSSH to port 2222 and running this on port 22 largely went off without a hitch.

]]>
https://breq.dev/projects/fissh /projects/fissh Wed, 25 Sep 2024 00:00:00 GMT
<![CDATA[Why You Should Write an Emulator]]>

One of my favorite quotes is part of The Cult of Done Manifesto by Bre Pettis and Kio Stark:

Those without dirty hands are wrong. Doing something makes you right.

So do it. Go write an emulator. Pick a programming language. Pick a basic CPU architecture: either something retro like the 6502 or Z80, or something modern but still simple like the RISC-V base integer instruction set, or something designed for teaching like Pep/9. Implement a basic model of its registers and a simple block memory interface. Start with a few opcodes, write some assembly programs to test them out, and then add more.
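If that still sounds abstract, here's the kind of skeleton I mean -- a toy accumulator machine with a handful of made-up opcodes, not any real architecture:

class CPU:
    def __init__(self, program: bytes):
        self.memory = bytearray(256)
        self.memory[: len(program)] = program
        self.pc = 0       # program counter
        self.acc = 0      # accumulator register
        self.halted = False

    def step(self):
        opcode, operand = self.memory[self.pc], self.memory[self.pc + 1]
        self.pc += 2
        if opcode == 0x01:    # LDA imm: load an immediate into the accumulator
            self.acc = operand
        elif opcode == 0x02:  # ADD addr: add the byte at addr
            self.acc = (self.acc + self.memory[operand]) & 0xFF
        elif opcode == 0x03:  # STA addr: store the accumulator at addr
            self.memory[operand] = self.acc
        elif opcode == 0xFF:  # HLT
            self.halted = True
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")

    def run(self):
        while not self.halted:
            self.step()

# LDA 5; ADD [0x10]; STA [0x11]; HLT
cpu = CPU(bytes([0x01, 0x05, 0x02, 0x10, 0x03, 0x11, 0xFF, 0x00]))
cpu.memory[0x10] = 7
cpu.run()
assert cpu.memory[0x11] == 12

Once the skeleton works, the rest of the project is just adding opcodes, addressing modes, and peripherals -- and writing small assembly programs to exercise each one.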

By the time I sat down and wrote an emulator, I had taken courses on the architecture of a CPU! And yet writing this one piece of software fundamentally changed my understanding of how computers work at a basic level. Computers are obviously far more complicated than the toy examples I'm suggesting, but advances like branch prediction and caching did not fall out of a coconut tree; they exist in the context of all that came before them.

Concepts like Return-Oriented Programming require a strong ability to reason about how machine code is executed. Working with computers at the lowest level through emulator development helped me develop this reasoning ability far better than a class would.

Another underrated skill in software development (that I especially think isn't taught well!) is reading a datasheet and translating that to a software implementation. Writing software in the real world involves working with lots of weird protocols with documentation of varying quality: on Rover I work with tons of random sensors and modules, at work I write code that communicates with lasers with protocols that vary widely between "very reasonable" to "moderately insane," and in personal projects I frequently find myself reaching to implement communication with a part that lacks a comprehensive library. While implementing my emulator, I reached for datasheets, schematics, and any documentation I could find. And where the docs weren't clear, I was forced to find ways to answer questions about the protocol experimentally.

When I started, it felt like almost all resources I found on emulator development tried to talk me out of it:

Good knowledge of the chosen language is an absolute necessity for writing a working emulator, as it is quite complex project, and your code should be optimized to run as fast as possible. Computer emulation is definitely not one of the projects on which you learn a programming language.

- "How To Write a Computer Emulator" by Marat Fayzullin

Writing an emulator was my first project in the Rust programming language, and I don't regret it at all. An emulator forces you to think deeply about your code's structure: in the real world, retro computers reused chips for different purposes, split functionality across different parts of the system, and were generally full of leaky abstractions. How do you write a program which encapsulates that structure? Mirroring the structure exactly a la MAME leads to convoluted code, but building too many abstractions leads to quite complex data flow. And if you're anywhere near as much of a perfectionist as me, you'll rewrite each component at least three times, agonizing over the small details and using every tool your programming language offers you. And each time you rewrite it, you'll learn something.

It is my belief that no single project introduces you to the good, the bad, and the ugly aspects of a programming language better than an emulator.

In classes, computer science assignments are typically rigidly defined (you're given an exact set of functionality to implement) and small-scope (you write code once and never touch it again). There's great value in just sitting down by yourself a few times a week and building something: not knowing what exactly you're making nor what "done" even looks like. Being self-guided is an incredibly useful skill: in the real world, you won't have assignment descriptions or TAs to guide you to a solution, and there's often no exact description of what a solution looks like. Going from nothing to a functional, complex emulator shows mastery over this skill.

Large-scope projects are also fundamentally different to work on. While I'm all for building small, toy projects to learn technologies or solve simple problems, they don't help you learn how to structure a program. While group work can help, I'd make the case that it's better to learn design patterns in a solo project since you'll never encounter a pattern you aren't comfortable with. An emulator is usually a large enough project that you can't fit the entire thing in your head at once, meaning you'll need to rely on good design to help you navigate it.

Realistically, the code you're writing probably looks nothing like an emulator. Maybe you're doing frontend development, or writing a backend CRUD app, or even doing some embedded work. That said, I hope I've made the case for how writing an emulator is an immensely useful exercise regardless of the flavor of software development you do.

]]>
https://breq.dev/2024/09/14/emulator /2024/09/14/emulator Sat, 14 Sep 2024 00:00:00 GMT
<![CDATA[88x31 Dungeon]]>

88x31 Dungeon is a set of "games" that allow you to traverse the graph of 88x31 buttons, which are small, pixel art buttons that website owners include to link to related people and projects. It's based on data collected by eightyeightthirty.one, a scraper which originally mapped the full 88x31 network. The graph is used to generate a "dungeon" grid, with each room containing a particular website. This project can be thought of as an alternate way to visualize and interact with the graph created by eightyeightthirty.one.

Experiences

88x31 Dungeon offers a set of related experiences, each providing a different way to interact with the graph.

Birds Eye

The birds-eye view shows the full grid and allows the user to scroll across the map. This was the first and simplest experience I implemented, and is useful mostly to visualize the layout of the dungeon. Each square renders the site's 88x31 and domain name, and it's possible to "fly" to a specific site.

Navigate

Navigate embeds each site with an <iframe> and allows you to visit its neighbors using the arrow keys. It's loosely inspired by the "dungeon crawler" format, but the arrow buttons at the top let you see the domain of the site you're about to navigate to. It's almost like a two-dimensional webring. I find it fun to move quickly through this one, getting a feel for each region of the dungeon by seeing each site for a few seconds.

Walk

Walk presents a draggable canvas on which rooms of the dungeon are embedded as <iframe>s. The buttons at the left allow you to switch between "drag mode" and "interact mode," where the latter allows you to scroll, click, and interact with each website on the grid. Unlike the other frame-based experiences, Walk shows up to nine frames at once, letting you see each site in the context of its neighbors.

A limitation of Walk is that each site is rendered in a relatively small viewport, breaking some sites. This is particularly an issue on mobile browsers where the viewport is already very small.

Crawl

Crawl is the most similar experience to a text-based dungeon crawler. At the top, the site is embedded in a 4:3 <iframe>, and at the bottom, the user can run commands in a terminal.

Crawl provides the following commands:

  • go [dir]: Move in a direction (north, southeast, etc)
  • look: Look around (prints the domains of the rooms that border the current room)
  • fly [domain]: Teleport to a specific room

Technical Description

This project consists of two parts: the code to generate the dungeon, and the frontend experiences.

Dungeon Layout Generation

A full view of the generated dungeon layout.

The dungeon is generated using a breadth-first greedy algorithm, which is roughly defined as:

  1. Start by placing breq.dev at (0, 0) and adding its neighbor cells to the queue
  2. Pull a grid cell from the queue
  3. Find domains with links (in either direction) to the rooms neighboring this cell
  4. Sort the candidate domains by the number of unique links
  5. If no candidates exist, return to step 2
  6. Place the best candidate in a room at this cell
  7. Add the neighbors of the newly placed cell to the queue
  8. Return to step 2

I'm happy with how this algorithm creates neighborhoods, and it places a good portion of the total graph. However, it struggles with cliques, since each node can have at most 4 neighbors.
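Condensed into code, the placement loop looks roughly like this (a simplified sketch -- the real implementation works over the eightyeightthirty.one graph dump, and the scoring here is boiled down to a simple link count):

from collections import deque

NEIGHBORS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def generate(links: dict[str, set[str]], start: str = "breq.dev") -> dict[tuple[int, int], str]:
    grid = {(0, 0): start}
    placed = {start}
    queue = deque((dx, dy) for dx, dy in NEIGHBORS)

    def link_count(domain: str, cell: tuple[int, int]) -> int:
        # Count links (in either direction) to rooms already placed around this cell.
        x, y = cell
        return sum(
            1
            for dx, dy in NEIGHBORS
            if (n := grid.get((x + dx, y + dy)))
            and (n in links.get(domain, ()) or domain in links.get(n, ()))
        )

    while queue:
        cell = queue.popleft()
        if cell in grid:
            continue
        candidates = [d for d in links if d not in placed and link_count(d, cell) > 0]
        if not candidates:
            continue
        grid[cell] = best = max(candidates, key=lambda d: link_count(d, cell))
        placed.add(best)
        x, y = cell
        queue.extend((x + dx, y + dy) for dx, dy in NEIGHBORS)

    return grid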

Putting Sites in Frames

Each experience is implemented differently, but most required embedding untrusted sites using <iframe> tags. Iframes are a relic of the old web, and honestly would probably not become a web standard if they were proposed today. A few issues I ran into included:

React reusing frame elements

Previously, some parts of the implementation looked like this:

return <iframe src={`https://${room.domain}`} />;

The subtle bug here is that when room changes, the iframe element is reused instead of recreated, so the old frame's contents persist while the new ones load. This leads to a confusing user experience, especially in Walk, where frames are constantly changing positions and sources.

React provides a key prop to tell it which elements to create, destroy, or reuse. The best solution I found in this case is to wrap the frame in a one-element list and key it on the domain:

return <>{[<iframe src={`https://${room.domain}`} key={room.domain} />]}</>;

Dead sites

If you scroll for any appreciable amount of time, you're likely to find either the "page crashed" icon (in Chrome) or a "Firefox can't display this page" error (in Firefox). Unfortunately, many targets of 88x31 links are dead pages. While I'd love to show a fallback UI containing the 88x31 button itself in these cases, iframes don't fire error events, since that could be used to "probe the URL space of the local network's HTTP servers." Since there are plenty of other ways to probe the local network's HTTP servers from a website, I'm not sure what the merit of this policy is, but as web developers we must accept that we live and die at the whim of browser makers.

Amazingly, someone found a batshit insane way to solve this problem, but I'm hesitant to use a brittle solution like this.

Sites setting the X-Frame-Options header

A lot of sites set the X-Frame-Options header (or its modern alternative, the Content Security Policy directive frame-ancestors) to prevent untrusted websites from loading them as frames.

There is a legitimate security argument for some sites due to an attack called Clickjacking, in which trusted content is loaded into an <iframe> and buttons, input fields, or other interactive elements are overlaid on top of the frame to capture input. However, this isn't applicable to most simple personal websites.

While it might be possible to build and use a proxy which strips these headers from the site, doing so would violate the spirit of this header: if someone set this header, they would probably prefer their site not be embedded into other sites, and I don't want to override their preference. (I'd still really prefer to show a fallback... alas.)

Sandboxing frames

During early development, I found a few sites acting annoyingly, including one that would navigate the top-level window away (what the hell?). While I'd love it if everyone on the indie web played nice, I did implement some protections using the sandbox attribute.

I added the following flags:

  • allow-scripts, since JS in frames is mostly harmless
  • allow-same-origin, since leaving this out interferes with many websites' ability to load fonts and other resources

This still lets frames cause problems (lagging the user's computer, autoplaying annoying audio, etc.), but it deals with the worst offenders. Besides, having this project decide what counts as "annoying" and lock down iframe features further would run counter to the spirit of showcasing the variety of the indie web. This approach strikes a balance between keeping the site usable and leaving room for creative expression.

Conclusion

While some parts are a little clunky, the intent behind this was less a polished end result and more an experiment in designing interesting and fun experiences that are part game and part visualization tool. I'm happy with the variety present in the end result. I'll undoubtedly end up finding more ideas for experiences, and I'm excited to explore this concept further.

]]>
https://breq.dev/projects/dungeon /projects/dungeon Sat, 07 Sep 2024 00:00:00 GMT
<![CDATA[Thoughts on Cities, Navigation, and Home]]>

To paraphrase a close friend of mine:

Boston is like a Grand Theft Auto map. There's a lot to do, but it never really feels all that spread out.

I moved to Boston almost exactly three years ago. What I love most about this city is how accessible it is. Here are, roughly chronologically, the ways I've picked up for getting around, and my reflections on city life.

The Subway

By far the most obvious answer to "how do you get around?" is the subway. Despite distinctly remembering a CharlieCard shortage for around a month after I arrived in the city, I quickly started using the system to explore.

The subway gave me direct access to downtown, and somewhat efficient access to many other places around the city. But using it also feels kind of like "fast travel" in a video game: you hit a button somewhere, wait for the map to load in, and boom, you're somewhere else, bypassing all of the beautiful level design in between.

Walking

When I first moved here (and didn't have many friends), I would sometimes pass the time by just stepping outside, pointing myself in a random direction, and walking. When I would get tired, I would start looking for a T stop or just turn around and head home.

Shortly after, I started challenging myself to walk places without looking at any map: learning street names, orienting myself based on the skyscrapers and cranes overhead, and taking plenty of wrong turns in the process.

I love the chaotic street network of Boston, and how it always gives the sense that there's more to explore.

Walking puts me on the same scale as the world around me. I can look into any storefront, stop to peek down side streets, or take in the scenery around me. To this day I'll still happily walk impractical distances from time to time for no reason other than to take in my surroundings.

The Bus

It took me about a year before I started taking MBTA buses. Many friends I've talked to share the same experience. I'm not sure if it's because the bus network is harder to understand, buses are the more "looked down upon" form of transit, or if those who are new to city life, like my past self, are just less aware of the utility of local bus routes.

I've found the social dynamics on buses to be quite different to the subway. You'll see tons of people piling into the back doors on the Green Line to avoid paying the fare, but you almost never see that on buses. On the bus, there's a much stronger sense of "we're all in this together."

Buses also let me expand my radius of places I can explore. A few of my friends live in places vaguely accessible by infrequent bus routes, and while we'll usually opt for the more practical option of meeting at a subway line terminus, it's fun to occasionally take a long, winding transit journey.

Commuter Rail

While the most expensive transit mode on this list, the commuter rail is hugely useful both for going beyond the confines of the city and quickly getting to and from downtown (provided you time it right). Honestly, it's one of the modes I wish I took advantage of more.

Also: the $10 weekend passes are an insanely good deal. Use them.

BlueBikes

I didn't start biking in the city until recently, but I immediately fell in love with it. Biking immediately felt like the "just right" scale: when riding, you can see every detail of every street, you have full autonomy to stop and explore and detour to your heart's content, but you also get places fast.

When I was young, I used to watch those Casey Neistat vlogs where he lane-splits through NYC traffic on a skateboard, and I always wished I could experience a feeling like that. Biking gives me that excitement: navigating complex situations at high speed, finding minuscule ways to shave time off of trips, and playing my small part in the chaotic yet coordinated motion of traffic.

I expected biking in Boston to be scary, mostly because people tend to make it out to be. But coming from narrow, rural, two-lane roads with large trucks whizzing by at highway speeds, the city feels like a cakewalk by comparison. Sure, you're navigating way more interactions with cars than you would be otherwise, but you don't get nearly the speed differentials of somewhere more rural.

That said, if you take anything away from this post, wear a helmet. One of my most "I can't believe I didn't buy this sooner" purchases was a nice folding helmet that fits in my backpack, so I'm ready to ride safely whenever.

Cars

My girlfriend and I are looking to buy a car soon. While I'm excited to have easier access to places beyond the city, I'm simultaneously nervous that the convenience of car trips will make me lose out on the connection to the city that life without a car has allowed me to build over the past three years.

To end on an anecdote: Once, after a robotics competition in rural Utah, a close friend of mine picked me up to drive me to her home in New Mexico. Sitting on the center console was a notebook on which she had written the numbers: 70, 191, 491, 160, 84 -- the highway numbers we followed on our drive back.

I asked about the notebook, and she said she was trying to use Google Maps less, since she felt it made her less connected to the areas she drove through: instead of looking out the window to understand and be present in the places you pass, it encourages you to just follow the arrows, ignoring the world around you.

Whenever I navigate a place by memory or intuition, be that the chaotic streets of Boston, the sprawling MBTA bus network, the back roads of small towns in Maine, or the Albuquerque airport I hadn't been through in a year, I feel, in a way, welcomed by the communities I pass through. I feel nostalgic for the times I passed through those same places before. I feel excited to explore, knowing that wrong turns can be less of an annoyance, more of an adventure. I feel grounded, able to anchor my experiences to the world happening around me as I traverse through it. Above all, I feel content, no longer stressed by the arrow on my phone screen, just present in the here and now.

]]>
https://breq.dev/2024/08/15/boston /2024/08/15/boston Thu, 15 Aug 2024 00:00:00 GMT
<![CDATA[Rolodex]]>

My own card as it appears in my account.

Overview

Rolodex stores a virtual contact card with a callsign, name, and other metadata for each of your amateur radio friends. Cards can be viewed, created, and updated on a desktop or mobile device.

Motivation

My friends and I often use local amateur radio repeaters to communicate, so I often need to call one of them by callsign. I didn't feel like I had a centralized place to store this information: a text note would be difficult to navigate and annoying to sync between devices, a traditional contacts app lacks an appropriate field for callsign or DMR information, and I wanted something that I could search by callsign in case I can't quite remember who holds a particular call. The resulting app is a single place where I can track all of my friends' calls, and is something I can easily reference from my phone in the field.

I chose the name "Rolodex" because it conveyed the purpose of the app (storing contacts), and it sounded unique, functional, and a little playful.

Technical Description

The frontend uses the stack I tend to reach for in frontend work: Vite, TypeScript, React, and Tailwind.

Cards use a fixed aspect ratio to follow the ID-1 spec. I used the morse code version of the callsign as an artsy divider between the callsign and name -- this uses a specific morse code font file. The interface uses a mix of DIN and JetBrains Mono, two fonts I haven't worked with much before, since I felt they worked together well and gave each card a utilitarian look without being off-putting.

The "rolling" animation in the column view works by checking where each card is in the scroll view with a JavaScript event, then applying rotation in X and translation in Y and Z. This took me the most effort to get right, and I'm probably going to continue tweaking it to improve its behavior across various screen sizes.

The backend of the app is entirely done in Firebase. This was my first Firebase project, and I figured it was a good fit since the requirements are quite standard (it needs a simple authentication system and a basic way for each user to store and retrieve a small amount of data).

The app is a Progressive Web App, allowing me to install it on the home screen of my phone.

Results

You can try it at rolodex.breq.dev! At the time of writing, we have 5 total users, including myself! Personally I have more than a dozen contacts saved already, and am constantly adding more. I've already found it quite useful, and am looking forward to using it more as more of my friends get licensed and have callsigns issued to them.

]]>
https://breq.dev/projects/rolodex /projects/rolodex Wed, 03 Jul 2024 00:00:00 GMT
<![CDATA[My Dream Handheld Computer]]>

Nostalgia

When I was younger, I had a PocketCHIP handheld that I would carry around. It's what I wrote the MakerGamer fantasy console for, and it's the device I first started experimenting with batman-adv mesh networking on.

I loved being able to have a Linux box with me that I could use for experimentation. The keyboard sucked, and the interface wasn't amazing on a tiny-resolution touchscreen, but that didn't matter for quick usage, and it was still miles ahead of pulling up Termux on my phone.

Part of what got me thinking about this is seeing a few of my friends pick up Flipper Zero devices. These have that quality of being a device for "real-world" tinkering and exploration, encouraging people to interact with systems in the world around them. The main reason I don't have one is just that I'm not particularly interested in RFID/NFC applications (but I applaud those who are).

About a year ago I picked up a cheap Baofeng handheld radio for amateur use. While it's a practical way to communicate with friends on a short range, I've found the most fun part is using it to explore the world around me -- finding repeaters and trying to communicate with them, listening on railroad or marine channels, and generally exploring to see what systems out there that I can observe. (See codeplug for how I've iterated on this process over time).

I want a device which combines the convenience of a Baofeng, the customizability of a Flipper, and the power and peripherals of a PocketCHIP. But more broadly, I want something that encourages that sort of exploration with a physical element to it.

There was a time recently when I was sitting out in a park downtown, poking at some interesting stuff accessible over Wi-Fi that I'm not quite ready to talk publicly about yet. While I was doing this on my laptop, doing so felt odd given all I needed was a basic environment where I could configure my Wi-Fi settings and invoke SSH. A dedicated handheld device would have been just as functional and a bit more fun.

My Dream Handheld Computer

Here's a rough list of requirements that I'd look for in such a device. This is based on my own interests and the areas of tech I tend to explore most.

  • Main Processor: I want something that runs a proper Desktop Linux distro (not Busybox etc), and is powerful enough to run at least a basic Web browser. This probably means a Raspberry Pi compute module or similar.
  • Battery: At least enough for a full day of running around and using it intermittently.
  • Keyboard: Full QWERTY. Good enough to use the CLI with.
  • Screen: This doesn't need to be good enough to display something like GNOME well, but should at least allow for basic GUI apps and comfortable terminal use.
  • Hardware Connections:
    • A full RJ45 Gigabit Ethernet port. I've had so many cases where I just want to get on a network quickly and run an SSH command, and breaking out my laptop plus dongle and messing with the macOS network settings is always painfully clunky.
    • USB-C (as an Ethernet gadget, etc). For longer development sessions, I'll want to connect a proper laptop.
  • Radios:
    • Wi-Fi is a necessity in today's world, Bluetooth would be a nice-to-have.
    • NFC capabilities would be cool for people who are into that sort of thing.
    • I'd love some sort of amateur or otherwise "unconventional" radio capabilities, to allow these devices to be used for communication. While drop-in modules for the 2m band are more rare, lots of them exist for the 70cm (430 MHz, licensed) and 33cm (900 MHz, unlicensed) bands. 430 MHz would likely give better range/performance, while 900 MHz gives interoperability with the Meshtastic network.
    • I'd also like a proper external antenna for the above radio, like you'd find on an amateur HT.
  • Extensibility: Adding external exposed GPIOs would be cool, but also, I haven't seen many people taking advantage of them with the Flipper Zero or PocketCHIP. Connectors like Adafruit STEMMA might be more useful in practice.

Existing Products

Dead Stock PocketCHIP

Until recently, the website pocketchip.co sold old PocketCHIPs (it still sells CHIPs, albeit at 4x the original selling price). Even if they were still available, though, the lack of up-to-date software for them makes common tasks like flashing a hassle.

Clockwork uConsole

The Clockwork uConsole is probably the closest thing I've found to the device I want, but since it's designed more for "fantasy console" use, it falls short in a few ways:

  • Available ports are limited. While it has a USB port for host and device use, it lacks a physical Ethernet port :(
  • It features an extension module interface, but currently the only such module is an LTE modem (probably not that useful to me, since I could just use my smartphone hotspot). Maybe a module for something like LoRa could be designed using the same connector, with an external antenna poking out of the side?

It's also still limited to pre-order.

Whatever arturo182 Is Cooking

Solder Party has previously sold a "Keyboard FeatherWing" containing an LCD, QWERTY keyboard of the type you'd find on a Blackberry phone, and a place for an Adafruit Feather board. While a Feather doesn't provide quite the amount of computing power I'd like for such a device, I love the form factor and the pricing is reasonable.

Rooted Android Devices

What about carrying around a rooted Android phone for something like this, or rooting my daily driver? This would give an unrestricted Linux environment to play around with, but it also falls short of what I'm after:

  • Poor connectivity: no RJ45, spotty USB host support.
  • Not quite "Desktop Linux-y" enough to run apps designed for typical Linux distros.
  • Using a CLI with a touchscreen keyboard is painful.
  • No real extensibility for something like LoRa.

Designing My Own?

While I'm relatively comfortable with basic PCB design, this is something that's likely way beyond my skill level. Plus, I'd ideally like to have something that can have a community form around it (like PocketCHIP did, or like Flipper Zero does now). There's also the enclosure to consider -- PocketCHIP did this relatively well and the Flipper does it perfectly.

I'm largely ruling this out unless I suddenly meet lots of people who share this vision but have much stronger electrical/mechanical skills than I do.

Something Else?

If you can think of something that fits in this general category/vibe of product, please let me know about it! Even if it isn't what I'm personally looking for, I really want to learn more about projects in this space.

]]>
https://breq.dev/2024/06/21/handheld /2024/06/21/handheld Fri, 21 Jun 2024 00:00:00 GMT
<![CDATA[Codeplug]]>

The codeplug generation tool in action.

Overview

Codeplug is a tool to automatically generate a configuration file (or colloquially, a "codeplug") for a handheld amateur radio.

The tool prompts users to select their geographic regions of interest and to choose which common simplex channels on amateur and other bands to include (such as calling frequencies, FRS channels, etc.), and builds a .csv file containing those channels and channels for radio repeaters in their selected regions.

Motivation

Many of my friends are licensed amateur radio operators, and when we travel together, we often exchange notes on which repeaters to program into our radios. However, as I've made more extensive use of my radio, my previous strategy of including every repeater I can think of in my codeplug has caused me to hit the channel limit.

Ari and I had the idea for a program which can automatically generate a suitable codeplug based on a subset of regions, channels, etc., allowing us to easily program in relevant channels to our radio based on our current interests, upcoming trips, and more.

Technical Description

The tool is a CLI app built with Inquirer for Python. It loads repeater and channel definitions from .yaml and .csv files, and assembles a .csv which can be imported into CHIRP, a common tool for editing and uploading these configs.
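A stripped-down sketch of that flow, using the python-inquirer package (the repeater entries here are made up, and a real CHIRP import file has quite a few more columns):

import csv

import inquirer

# Stand-ins for the real .yaml/.csv channel definitions.
REPEATERS = {
    "Boston": [("Repeater A", 145.230, "-", 88.5)],
    "Maine": [("Repeater B", 146.940, "-", 100.0)],
}
SIMPLEX = {"2m calling": 146.520, "70cm calling": 446.000}

answers = inquirer.prompt([
    inquirer.Checkbox("regions", message="Regions to include", choices=list(REPEATERS)),
    inquirer.Checkbox("simplex", message="Simplex channels", choices=list(SIMPLEX)),
])

with open("codeplug.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Location", "Name", "Frequency", "Duplex", "Tone"])
    location = 0
    for name in answers["simplex"]:
        writer.writerow([location, name, SIMPLEX[name], "", ""])
        location += 1
    for region in answers["regions"]:
        for name, freq, duplex, tone in REPEATERS[region]:
            writer.writerow([location, name, freq, duplex, tone])
            location += 1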

Results

Instead of continuing to maintain my codeplug by hand, I'll be using and improving this tool going forward. A few of my friends have already offered to contribute repeater information to it, and I'm planning on extending it to include AAR channels by region as well.

]]>
https://breq.dev/projects/codeplug /projects/codeplug Thu, 13 Jun 2024 00:00:00 GMT
<![CDATA[DDS Tuning for ROS 2]]>

Picture this: you're developing a robot that you aim to control using a nice, clean user interface. Control goes through this UI on one laptop over a network connection to the robot, which runs its own SBC, all using standard ROS 2. You do a ton of testing, at first just with an Ethernet cable since you don't feel like dragging your radios out of the closet, and everything looks good! And then, you show up to an event with your robot, and you realize that whenever you switch views in the UI, suddenly your entire comms link grinds to a halt.

So, what gives?

Discovery Traffic and Fragmentation

This issue is a classic case of discovery traffic trashing your link. Whenever the UI subscribes to a new topic in ROS, the Data Distribution Service (DDS) middleware needs to figure out which node in the network is publishing that topic, so it generates a ton of multicast UDP traffic to send out to the network. This works fine over a high-bandwidth Ethernet link, but causes problems with a bandwidth-constrained radio connection.

When our team hit this, we took some Wireshark captures before and after. Here's the normal, working state of the network:

And here's after we generate a bit of DDS discovery traffic:

The unassembled fragments! The horror...

If you're not familiar with fragmentation, it's something that happens at the network layer to break large packets into smaller ones to fit your particular network's maximum transmission unit (MTU), which is usually 1500 bytes. This disassembly and reassembly is typically transparent to the user, and is handled at the OS level. The fact that Wireshark is showing these unassembled fragments implies that they couldn't be reassembled correctly -- some of the fragments were dropped, and now the message can't be recovered.

In other words, this is not cute; computer networks only do this when they're in extreme distress.

Tunable Parameters

The ROS 2 documentation suggests a few knobs you can turn to improve performance (applied in the short sketch after this list):

  • Reduce net.ipv4.ipfrag_time. By default, Linux allows 30 seconds for a packet to be reassembled before giving up and dropping the fragments from memory. If a lot of fragmentation is happening, allowing this much time can fill buffers quickly, and on a small network (on the scale of one robot and its base station), it shouldn't ever take more than a few seconds for a fragment to get from one end to the other. ROS recommends a value of 3 seconds.
  • Increase net.ipv4.ipfrag_high_thresh. This controls the amount of memory used for packet defragmentation. By default it's 256 KiB, but in a scenario with lots of fragmentation happening, you can bump this up as high as 128 MB.
  • Increase net.core.rmem_max. This controls the size of the buffer that the Linux kernel uses for receiving data on a socket. ROS suggests anywhere from 4 MiB to 2 GiB depending on the DDS vendor.

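As a concrete example, here's a minimal sketch (with assumed values) that applies those three settings by writing to /proc/sys. It needs root, and for a real deployment you'd persist the values in /etc/sysctl.d/ instead.

from pathlib import Path

# Values from the list above; tune them for your own system.
SETTINGS = {
    "net/ipv4/ipfrag_time": "3",                            # seconds to hold fragments
    "net/ipv4/ipfrag_high_thresh": str(128 * 1024 * 1024),  # reassembly memory, in bytes
    "net/core/rmem_max": str(4 * 1024 * 1024),              # socket receive buffer cap
}

for key, value in SETTINGS.items():
    Path("/proc/sys", key).write_text(value)
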
Alternative DDS Middleware

One thing to try is changing the actual DDS implementation you're using. DDS is an open standard, and there are plenty of implementations to choose from. The popular implementations all have corresponding ROS middlewares written, and it's trivial to switch from one to the other.
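
In practice, the switch usually amounts to installing the alternative middleware package and setting the RMW_IMPLEMENTATION environment variable before your nodes start. A small sketch (assuming rmw_cyclonedds_cpp is installed):

import os
import subprocess

# Launch a ROS 2 command with CycloneDDS selected as the middleware. Exporting the
# variable in your shell or launch files accomplishes the same thing.
env = dict(os.environ, RMW_IMPLEMENTATION="rmw_cyclonedds_cpp")
subprocess.run(["ros2", "topic", "list"], env=env, check=True)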

The default middlewares so far have been:

  • Ardent, Bouncy, Crystal, Dashing, Eloquent, Foxy: FastDDS by eProsima
  • Galactic: CycloneDDS by Eclipse
  • Humble, Jazzy: FastDDS by eProsima

In addition to FastDDS and CycloneDDS, there are ConnextDDS and GurumDDS, but those lack open-source licenses.

Interestingly, Cyclone seems to be more loved by the community than FastDDS: the MoveIt motion planning framework recommends Cyclone. For what it's worth, switching to Cyclone largely fixed our discovery traffic problems:

DDS Overhead

This slide deck by Charles Cross goes into detail about some techniques for optimizing ROS 2 traffic. One often-overlooked feature of networking in ROS 2 is the amount of overhead for each message.

Here's a packet containing a command that we sent to the left wheel of our robot. It's a single 32-bit floating point number, so the value itself takes up 4 bytes. How much overhead does transmitting this incur?

That's 138 bytes over the wire:

  • 14 bytes of Ethernet header
  • 20 bytes of IPv4 header
  • 8 bytes of UDP header
  • 96 bytes of RTPS data
    • 20 bytes of RTPS header
    • 12 bytes of INFO_TS submessage
    • 32 bytes of DATA submessage
      • 24 bytes of header
      • 4 bytes of encapsulation kind/options
      • 4 bytes of data (this is our actual message)
    • 32 bytes of HEARTBEAT submessage

In other words, sending each message requires about 134 bytes of overhead, regardless of the actual size of the message. With this in mind, we can greatly reduce the load over the network by combining as much as possible into as few topics as possible. In practice, here's how we applied this:

Custom Messages

Wherever possible, make use of ROS 2 custom messages. Each new system we build starts with a *_msgs package, which includes message definitions tailored to that stack.

There are lots of reasons to do this other than performance: it makes the development experience much better by giving meaningful names to values, it catches semantic "mismatches" caused by improperly hooking up a publisher and subscriber, and it helps make invalid states unrepresentable (can you tell I'm a Rust fan?).

For the above example, our team now uses a DriveCommand which encodes the speed of each of our six motors in a single message, each named semantically:

DriveCommand.msg
float32 left_front
float32 left_middle
float32 left_back
float32 right_front
float32 right_middle
float32 right_back

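As a back-of-envelope check of the savings, using the ~138-byte packet measured above (approximate, since real packets vary with padding and submessage batching):

# Six separate single-float messages vs. one six-float DriveCommand.
OVERHEAD = 138 - 4                 # bytes of headers/submessages around a 4-byte payload

six_singles = 6 * (OVERHEAD + 4)   # six packets, each carrying one float
one_combined = OVERHEAD + 6 * 4    # one packet carrying all six floats

print(six_singles, one_combined)   # 828 vs. 158 bytes per command cycle
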
Large Messages and Fragmentation

Our team ran into another issue with ROS 2: heavy fragmentation with large message types.

While filming a demo video of our system, we wanted to include a visualization of point-cloud data from our stereoscopic camera (a Zed 2i by StereoLabs). Last year, we got the shot (video here):

We had no trouble doing this in ROS 1 -- as our communication link degraded, the point cloud data just started to slow down. However, in ROS 2, this demo completely broke down.

The key difference here is that ROS 1 used TCP, while ROS 2 uses UDP. Each point cloud message is big: our sensor is quite high-resolution and each pixel needs a color and position in space. With a TCP connection, congestion control takes care of things, slowing down the link and causing frames to be dropped to keep things working -- this slows down the framerate at the output but otherwise works fine.

In ROS 2 using a DDS middleware, messages are sent over UDP. However, remember that frames on most networks are limited to an MTU of 1500. When you send a large message using a DDS, the DDS layer just creates a single massive UDP packet, and the OS is responsible for fragmenting it appropriately. If the computer on the other end can defragment it properly, you're in business. In a congested network, the chances of successful reassembly fall to near-zero, meaning you'll get nothing out of the other end of the connection.
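
To get a feel for how bad this is, here's a back-of-envelope sketch (with assumed numbers) of the odds that a large message survives when each IP fragment is dropped independently:

import math

MTU = 1500                        # bytes per frame on a typical Ethernet-like link
HEADERS = 28                      # 20-byte IPv4 header + 8-byte UDP header (roughly)
payload_per_fragment = MTU - HEADERS

message_bytes = 4 * 1024 * 1024   # e.g. a ~4 MB point cloud sample
fragment_loss = 0.001             # 0.1% per-fragment loss on a congested link

fragments = math.ceil(message_bytes / payload_per_fragment)
p_reassembled = (1 - fragment_loss) ** fragments
print(fragments, f"{p_reassembled:.1%}")   # ~2850 fragments, ~5.8% of messages survive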

Here, you've got a few options:

  • Try to control the fragmentation process yourself: avoid sending large messages, instead send smaller ones.
  • If you're able to, try to reduce the traffic over the network link to pull it out of a congested state (and hope for the best).
  • Move this data over a different transport mechanism entirely. (We do this for image streaming.)

QoS parameters

ROS 2 introduced Quality-of-Service parameters on subscribers and publishers. A QoS "profile" includes a few parameters (listed here roughly from most to least important):

  • History (Queue Depth): The number of samples to keep in the queue. Lower numbers improve efficiency, but may lead to old messages being lost if the network drops for a second or so. A sensor readout can usually have a queue depth of 1 since the system only cares about the last read value, but a button press value should have a larger queue depth or it's possible to miss inputs.
  • Reliability: "Best Effort" or "Reliable." Controls whether publishers should retry transmission to subscribers if it fails.
  • Durability: Whether the publisher holds on to samples to give to subscribers who join after the publisher starts ("Transient Local") or not ("Volatile").
  • Deadline: The expected maximum amount of time between messages on a topic.
  • Lifespan: The amount of time until a message becomes stale. Messages which are published but not received until after their lifespan expires are dropped.
  • Liveliness: Whether publishers are implicitly considered alive if they publish to a topic ("Automatic") or if they must manually use the publisher API to assert liveliness ("Manual By Topic").
  • Lease Duration: The amount of time between assertions of liveliness (see above) before a node is considered to have lost liveliness.

While subscribers and publishers use the same profile structure, they work in slightly different ways. A subscriber's profile is the minimum quality it's willing to accept, and a publisher's profile is the maximum quality it's willing to provide. This makes a bit more sense when you look at examples:

Publisher      Subscriber     Communication
Reliable       Reliable       Reliable
Reliable       Best Effort    Best Effort
Best Effort    Best Effort    Best Effort
Best Effort    Reliable       (doesn't work)

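Here's a minimal rclpy sketch showing how these parameters come together in a profile (the topic name and message type are just placeholders):

import rclpy
from rclpy.node import Node
from rclpy.qos import QoSProfile, HistoryPolicy, ReliabilityPolicy, DurabilityPolicy
from std_msgs.msg import Float32

rclpy.init()
node = Node("qos_demo")

# A typical "sensor readout" profile: keep only the latest sample, don't retransmit.
sensor_qos = QoSProfile(
    history=HistoryPolicy.KEEP_LAST,
    depth=1,
    reliability=ReliabilityPolicy.BEST_EFFORT,
    durability=DurabilityPolicy.VOLATILE,
)

node.create_subscription(Float32, "/battery/voltage", lambda msg: None, sensor_qos)
rclpy.spin(node)
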
Zenoh

What if, however, you weren't at the whim of the DDS protocol at all? What if you let go of the purism of the DDS standard and instead substituted a more efficient protocol that could better handle the congested, dynamic conditions of a lossy wireless network?

This is where Zenoh comes in. Zenoh is a pub/sub protocol designed with one goal: reduce network overhead as much as possible. Most importantly, it has plugins aimed at interoperating with existing DDS systems.

When working with ROS 2, Zenoh dramatically reduces the amount of discovery traffic generated (by up to 97%) compared to a traditional DDS middleware. We've also found it to be much better at communication in general even after the discovery state is finished. We're not the only ones: an Indy Autonomous Challenge team made the same observations.

So how is Zenoh some kind of miracle for communication? It's a well-designed protocol, but more importantly, it's not held down by the restrictions of the DDS standard. DDS is based on a protocol from 2003 and was designed foremost for interoperability, not efficiency. You can see this in the protocol's design: instead of defining a standard byte order, the byte order is specified per message via a flags field; information about the DDS system vendor is sent with every packet; each node is required to maintain a full graph of the network; and entity IDs are duplicated between submessages in a packet instead of being sent once per packet.

ROS 2 was initially pitched as a DDS-centric version of ROS, replacing the bespoke TCPROS with DDS as an open, standard communication protocol. However, frustrations with the complexity of the DDS implementations and their poor performance on lossy wireless networks motivated Open Robotics to search for an alternate middleware. Unsurprisingly, they chose Zenoh.

If you're reading this after May 23, 2024 (World Turtle Day), ROS 2 Jazzy Jalisco will be out, and hopefully rmw_zenoh with it.

In the meantime, we're stuck running a DDS on each host and using plugins to bridge the gaps. The best approach as of writing is to use zenoh-plugin-ros2dds, setting ROS_DOMAIN_ID to a different value on each host to prevent DDS communication over the network. Old tutorials used zenoh-plugin-dds, which works for ROS 2 applications but doesn't support ROS 2 tooling as well.

Zenoh ended up being the solution to our team's DDS dilemma. We plan to run Zenoh at the 2024 University Rover Challenge.

UDP vs TCP

When running over a network, Zenoh can use either TCP or UDP. TCP is usually chosen as a transport for its reliable transmission guarantees, so it may seem like a poor fit for a system where some topics only need best-effort delivery. We've found, however, that our system tends to work much better over TCP than UDP, and we believe this is due to TCP's congestion control mechanism.

TCP uses a complex congestion control algorithm to avoid sending too much traffic over a congested network and causing undesirable behavior. With our radio network, the constrained link is specifically between our two IP radios. Each radio has a full 100 Mbps link to its computer (either the Jetson on the rover or the laptop in the base station), so the computers running ROS have no direct knowledge of network conditions. When a naive UDP-based protocol like DDS sends far too much traffic over the network, packets get dropped at the radios, and the only feedback each host receives is a missing acknowledgement after a timeout. TCP congestion control, on the other hand, minimizes packet drops in the system, resulting in more reliable transmission.

ros2-sunburst

In true breq.dev tradition, this post leaves off with a tool I've written to help solve some tiny facet of this problem. In this case, it's a bandwidth usage visualization tool for ROS 2 traffic.

Here's a plot of our team's bandwidth usage by topic:

In ROS, topics tend to follow a hierarchical structure: your robot's drive stack might fall under /drive, commands might be under /drive/cmd_vel, etc. This structure isn't inherent to the transport protocol in any sense, but it does help you reason about which parts of your system are using the most bandwidth.

Using this tool works as follows:

  • Capture a sample of DDS traffic over your network with Wireshark. (Ensure you include the initial discovery traffic -- otherwise there is no way to map DDS writer/reader IDs to ROS 2 topic names).
  • Run this script on your packet capture file.
  • View the output in the terminal (the top topic names by value) and in the plot (the sunburst visualization).

You can grab the script from this gist.
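
If you just want the flavor of the final aggregation step, here's a rough sketch (not the actual script) that assumes you've already reduced the capture to a mapping of topic name to total bytes:

import pandas as pd
import plotly.express as px

bytes_by_topic = {                    # hypothetical numbers
    "/drive/cmd_vel": 120_000,
    "/drive/feedback": 340_000,
    "/camera/points": 9_800_000,
}

# Group topics by their first path component so the sunburst shows the hierarchy.
df = pd.DataFrame([
    {"stack": topic.split("/")[1], "topic": topic, "bytes": size}
    for topic, size in bytes_by_topic.items()
])
fig = px.sunburst(df, path=["stack", "topic"], values="bytes")
fig.show()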

]]>
https://breq.dev/2024/05/17/dds /2024/05/17/dds Fri, 17 May 2024 00:00:00 GMT
<![CDATA[Radios for Robots]]>

Radios are often an afterthought when designing a robotics project. Few robots are self-contained: almost all need to communicate with some external system, be that a basic remote control or a complex offboard system for advanced processing.

Most useful robots also need to operate in an uncontrolled environment. A Roomba needs to connect to a customer's Wi-Fi network, a self-driving car needs to operate reliably regardless of local RF conditions, and a photography drone can't fall out of the sky if an FPV racer sets up shop at the next field over.

My experience with this is primarily in the space of wheeled, "prototype-scale" robots of varying sizes. There is much to consider when scaling up a system beyond one or two prototypes, and the math tends to work out differently for flying 'bots compared to terrestrial ones. (And if what you're building is a boat, good luck -- RF and water do not mix well, and the Fresnel Zone is your enemy.)

Hardware

Radio hardware is built for a wide range of applications, meaning it's difficult to even enumerate the options available to you. Especially for a prototype or hobby project, expect to look in interesting places for what you need to make things work.

Consumer-Grade Networking

We've gotten quite good at building cheap radios to connect consumer devices over short distances via Wi-Fi, Bluetooth, or other protocols. While these chips might have been designed for things like smart lightbulbs or wireless headphones, they're often a great choice for short-range networks and can provide substantial throughput.

Your built-in Wi-Fi

Sometimes, the best radio is the one you already have. If your project uses a recent Raspberry Pi model, and your laptop has a Wi-Fi chip, then you're all set for short range communications!

With Wi-Fi, one device must be the access point and the rest are stations (clients). There are a few advantages of making your robot the access point:

  • You don't need to configure the robot with the details of your existing Wi-Fi network
  • You can operate the robot in areas without an existing network
  • You have full control over the channel on which the network operates
  • It's easy to connect more devices (e.g., encouraging nontechnical users to connect their phones to try operating your creation!)

This is a great option for running your robot at events where you won't know the setup beforehand, provided you don't expect too many other robots taking up their slice of the 2.4 GHz spectrum.

But it does come with a major limitation: It is generally quite difficult to create a setup in which your robot, as the access point, is able to access the internet via one of its clients. If your laptop is a station on your robot's network, then whoops, it can't be on your home network anymore. Even if you plug your laptop into an Ethernet connection with internet access, good luck getting your laptop to route traffic from your robot to the Internet and back. It can be done, but it takes some complicated network configuration that isn't fun to deal with.

Bluetooth

Bluetooth is great as an "it just works" system, requiring just the initial pairing flow before you're off to the races. I used it to send commands to the LED choker that I made. You can get up and running quickly with a microcontroller with Bluetooth support (like the nRF52840) or a drop-in Bluetooth serial module (like the HC-05 available from tons of sketchy Amazon sellers).

Bluetooth modules are typically much lower power than their Wi-Fi counterparts, don't usually work in scenarios more complex than a point-to-point connection, and have substantially worse throughput. Different Bluetooth profiles support different amounts of throughput, but something like the Serial Port Profile will struggle to transmit images or video.

Bluetooth also doesn't typically use the TCP/IP stack (although it can be done with a specific profile). This is typically fine for microcontroller-based robots, but can make development annoying if your robot runs Linux.

External Wi-Fi routers and cards

Still within the consumer realm, some off-the-shelf Wi-Fi routers and cards can be great for getting a bit more performance or range out of a Wi-Fi based system.

Consider a Wi-Fi router on your robot as an extension to the "robot as access point" idea in the prior sections. You get a physical piece of hardware with a sole purpose of providing a Wi-Fi network, along with Ethernet ports to connect any computers onboard your robot. Wi-Fi routers typically use higher power, have better antennas, and can provide more throughput than (ab)using your device's built-in radio.

Alternatively, you can put the router at a base station and connect it (e.g. via Ethernet) to give it Internet access. This gives you the benefits of a wireless network entirely under your control while still giving your robot and ground station Internet access for development.

On the station end, you can still improve compared to whatever built-in adapter you have. Some UAV enthusiasts working on the OpenHD project have identified the Realtek RTL8812AU and friends to be a strong contender, having the maximum allowed 500mW power output, two antennas for diversity and MIMO operation, and support for both the 2.4 GHz and 5.8 GHz bands (which we'll discuss more in the Band Planning section).

ESP32 chips

Back in my day, all we had was the ESP8266: no GPIO to speak of, spotty hardware support, and acceptable 2.4 GHz Wi-Fi in a very cheap package. Nowadays, Espressif Systems has unveiled a line of ESP32 chips with ample GPIO, fast clock speeds, and both Wi-Fi and Bluetooth capabilities.

Choosing an ESP32-based design gives you lots of options for small robots, either as the robot's primary microcontroller or as a peripheral. For instance, you can use Bluetooth to set up the initial connection, then let the user input their Wi-Fi details and switch to that connection. Or, you can use Bluetooth for telemetry and controls, while keeping a Wi-Fi link during development for stronger visibility.

Of course, this series of chips is in the microcontroller realm, so they won't be able to push as much throughput as a dedicated Wi-Fi router or card. For small and short-range projects, however, they can be a strong contender.

Drones

The world of UAVs has brought forth many unique solutions for radio communication. Unlike wheeled robots, you can typically assume that a drone will maintain line-of-sight with the control station at all times, giving signal penetration a lower priority. Drones also are extremely sensitive to weight, so solutions tend to be compact and simple.

One quirk of drone communications is that drones typically use two frequencies: one for control and basic telemetry, and another for FPV/video streaming, leading to two very different sets of solutions.

For controls, the RFD900 is by far the most well-known choice. It provides a basic serial connection which typically carries MAVLink messages. Unlike any of the other solutions we've looked at thus far, the RFD900 works in the 900 MHz band.

The RFD900 uses frequency-hopping spread spectrum: the specific frequency used will jump around in the 900 MHz band to avoid interference. This is great in a scenario with other RFD900s (like an FPV drone racing event) since they'll generally stay out of each other's way.

FPV drones also typically use analog video systems. There are a few reasons why these have stuck around, even though they use spectrum far less efficiently than digital systems:

  • Since digital systems need to perform video encoding, analog video requires much less hardware, saving weight
  • Analog video tends to degrade gracefully under interference or poor signal conditions
  • Since drones typically operate under line-of-sight conditions, they can use higher bands (most commonly, 5.8 GHz) which allow for wider channels

Because of this, analog video gear is widely available. If you're working with a small robot with only a single camera over a relatively short distance, this can be a good solution. However, note that you can't perform any onboard processing with such a camera.

WISP Gear

Wireless Internet Service Providers, or WISPs, are companies that provide Internet access to customers using wireless links as opposed to wired (fiber or copper) connections. Equipment made for WISPs is built to be reliable, designed for long-term operation, highly configurable, and to operate right up to the power limits allowed by law.

Ubiquiti products are cheap and work well, including the Rocket and Bullet series of radios. Each line comes in 2.4 GHz and 5 GHz variants, with 900 MHz variants discontinued but often found on eBay.

For stronger 900 MHz capabilities, consider the Cambium PTP 450 900. Note that these use a proprietary protocol. While Cambium makes some other variants, they're either difficult to get licenses for (like the PTP 450 on 3.65 GHz) or prohibitively expensive for most use cases (like the PMP line, which supports point-to-multipoint scenarios across many band types).

These radios provide an Ethernet port and can effectively make your network look like a flat Ethernet subnet, greatly simplifying your configuration. Alternatively, many Ubiquiti radios can be configured to operate as a standards-compliant Wi-Fi hotspot instead of in point-to-point mode, allowing you to combine a high-powered WISP radio with inexpensive consumer Wi-Fi gear.

Packet Radio

A promising method I've been looking for an excuse to use is low-cost packet radio modules. These implement a barebones protocol without any sort of pairing, connection, or scanning logic. They typically support relatively slow data rates, but can cover a relatively large area.

These radios tend to transmit in one of two bands:

  • 433 MHz: Generally unlicensed in Europe, licensed in the Americas (although usable with an Amateur Radio license).
  • 900 MHz: Unlicensed in both Europe and the Americas (albeit slightly different bands, 868 vs 915 depending on region). This can be a blessing (it's easier to get started with) or a curse (in the US/Canada, there are many more devices in this band than in 433).

A variant of these modules uses LoRa, a modulation technique that enables communication over much longer distances (although this increases cost).

Amateur Hardware

Another source of radios is hardware designed for amateur radio use. These operate on a huge range of bands, but typically focus on voice transmissions. The digital modes that do exist are intended for short, infrequent messages. This is typically not a good fit for a robotics scenario, but it could be useful.

On the absolute low end, you can always use a cheap Baofeng radio (usually ~$20 USD on Amazon) with your computer's sound card and some software for a basic digital mode.

Of course, amateur hardware also has restrictions (e.g., no commercial use) which we'll talk about more when we discuss amateur licensing.

Band Planning

When picking a radio, you'll often want to consider the frequency band it transmits on. Specific bands have certain innate characteristics which make them more or less suited to a given application. You'll also need to consider other radio systems within your robot and ensure you do not cause interference.

Different bands have different strengths and weaknesses. Generally, lower frequency bands will provide better signal penetration and non-line-of-sight operation (e.g., operating your robot from the other side of a wall), while higher frequencies are better suited to line-of-sight or short-range uses.
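
One way to build intuition for the range side of this tradeoff is free-space path loss, which grows with frequency for fixed-gain antennas. This is only a rough sketch -- it ignores obstacle penetration, antenna gain, and regulatory power limits entirely:

import math

def fspl_db(distance_m: float, freq_hz: float) -> float:
    # Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / 3e8)

for f in (915e6, 2.4e9, 5.8e9):
    print(f"{f / 1e9:.3f} GHz: {fspl_db(1000, f):.1f} dB at 1 km")
# 0.915 GHz: ~91.7 dB, 2.4 GHz: ~100.0 dB, 5.8 GHz: ~107.7 dB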

An advantage of higher-frequency bands is that they generally can provide much higher bandwidth, both via higher signal rates and wider channels. If your system involves streaming lots of high-resolution video, point clouds, etc., you will probably struggle on a lower-frequency setup.

If your robot has multiple communication systems, you'll need to make sure you aren't causing interference to yourself. This could take the form of multiple links to your base station (such as separate links for telemetry and video), or it could arise in a system with multiple interacting robots.

Generally, for multiple radio links to your base station, you will want to put each link on a separate band to minimize interference. For instance, many drone pilots use 900 MHz or 2.4 GHz for teleoperation and 5.8 GHz for video streaming.

The source of interference can also be tricky to identify. On the Rover team, we noticed an issue with our 900 MHz radio gear interfering with our high-precision GPS receivers, preventing us from attaining a precise fix.

Your robot might also need to operate in an area with lots of interference, such as a Maker Faire, in which the sheer number of users on the 2.4 GHz and 5 GHz bands can make protocols like Wi-Fi almost unusable. While frequency-hopping protocols can work more reliably, they still have their limitations.

Amateur Licensing

If you're a hobbyist, you aren't just limited to the ISM or unlicensed bands: the Amateur Radio Service is there for you! Getting an amateur license gives you access to a much larger array of bands across a wide range of radio spectrum. While this varies by country, most countries do something similar to the U.S.:

Band chart image created by the ARRL

The most important restrictions to keep in mind are:

  • Your use-case must be for a non-commercial purpose.
  • You generally cannot use encryption (although there's some debate on this). You definitely can't use encryption to hide the content of your messages.
  • Any data protocol you use must be publicly documented.
  • You need to obtain an amateur license, and you must identify with your callsign once every 10 minutes during transmission.

Getting licensed is generally straightforward (Becky Stern's article gives a good overview of the process). A few tips from my own journey:

  • If all you care about is passing the test, just go on hamstudy.org and memorize the questions for a few hours. There are resources that you can pay for to learn the concepts in more detail, but it's not necessary.
  • You'll initially be assigned a random callsign, but you can apply for a vanity call immediately afterwards for a small fee.
  • If you're a college student, see if there's a club on campus that runs exams!

In the U.S., there are a few classes of license, and most countries follow a similar structure. At the time of writing, I hold a Technician class license, which gives me almost all of the same privileges as the higher classes on the shorter-wavelength bands (433 MHz and up).

Some amateur bands overlap with unlicensed/ISM bands: the 33 centimeter amateur band takes the same space as the 915 MHz ISM band, the 2.4 GHz amateur band partially overlaps the ISM one, and the 5.8 GHz amateur band is a subset of the spectrum set aside for Wi-Fi. In these bands, you can often use commercial grade hardware at higher power limits or with larger antennas -- for instance, running a Ubiquiti access point on 2412 MHz at maximum power with a high-gain antenna. This gear generally isn't designed for use above the unlicensed limits, so you won't always gain a significant boost, but it can still give you a bit more range.

Other bands are generally pretty quiet except for amateur users:

  • The 2 meter and 70 centimeter bands are predominantly used for analog FM voice, although the 70 cm band overlaps with the unlicensed 433 MHz band in Europe (meaning you can find consumer hardware which operates on it)
  • The 23 centimeter (1240 MHz) band is generally pretty empty, although it is right next to the frequencies used by GNSS (e.g., GPS) receivers which can cause problems, and it's tough to find hardware for

Conclusion

Based on my experience so far, I've developed a few rules for selecting which system to use:

  • Start with Wi-Fi. It covers a surprising number of use cases regarding range and data throughput, and hardware is common, cheap, and lightweight.
  • Punch through obstacles using either higher power or longer wavelengths. If you're going a significant distance through a forest, for instance, you'll probably struggle with consumer Wi-Fi modules. You could switch to a longer wavelength like 900 MHz, which better penetrates obstacles, or you can stay on 2.4 GHz but switch to following amateur regulations and using WISP gear and high-gain antennas.
  • Consider more specialized systems if needed. You might struggle to find components to run on other bands or using different protocols, but it's sometimes the only way to satisfy an unusual use case.
]]>
https://breq.dev/2024/04/16/radios /2024/04/16/radios Tue, 16 Apr 2024 00:00:00 GMT
<![CDATA[88x31 Buttons and Network Science]]>

Background

You might've noticed the tiny buttons/badges at the footer of this site and others, each showing a pixel-art design and linking to another website. These are called 88x31 buttons, and they're a de-facto standard of the indie web. Sources on this are few and far between, but The Neonauticon identified the start to be Netscape, which published "Netscape Now" buttons that sites could add to show off their use of then-new web features.

People took this idea and ran with it, placing 88x31 buttons for themselves, their friends, and projects they supported in their websites, forum signatures, etc. Similar to webrings, 88x31 buttons provided a fun way for people to find related content to a given page.

My friends and I are working on mapping the entire 88x31 graph. The result of our work is the first ever snapshot of the entire 88x31 network, available in a single JSON file.

Relatedly, a few semesters ago, I took a course on network science. One of the main takeaways was that lots of real-world networks exhibit something called the small-world property, specifically:

  • High clustering (you are more likely to be friends with your friends' friends)
  • Low distance (the typical distance between two randomly-chosen nodes stays small as the graph grows)

We see this phenomenon across all sorts of network types: the original 1998 Watts & Strogatz paper uses networks of film actors, the electrical grid, and the neural network of a worm.

So: we've got a brand new dataset and some criteria to test. Does the 88x31 graph exhibit the small-world property?

Analysis

To analyze the behavior of the graph as it expands, I'll be using two datasets: one with 4428 nodes that we obtained partway through the scraping work, and the most recent dataset with 16023 nodes. Analysis was done using Gephi.

Clustering

Clustering is the measure of how locally-connected a graph is. Think about a social graph: if everyone had a set of, say, 20 friends randomly chosen from a group of 1 million, you'd have quite low clustering. However, typically, you're much more likely to be friends with your friends' friends, implying high clustering. Clustering is quantified by the clustering coefficient.

The formal definition of a graph is a set of vertices (nodes) V and a set of edges (links) E, such that:

G = (V, E)

We write that edge e_{ij} connects vertex v_i with vertex v_j.

To define clustering, it's useful to define the neighborhood of a vertex N_i as the set of all of its connected neighbors, considering both links "in" and links "out":

N_i = \{ v_j : e_{ij} \in E \vee e_{ji} \in E \}

Let's go a step further and define the degree of a vertex k_i as the number of connected neighbors it has, i.e.,

k_i = |N_i|

where we use the vertical bars to mean the cardinality of (i.e., the number of items contained within) the set of neighbors.

How many links could exist within the neighborhood of a node? Let's ignore self-loops, so the number of possible links is k_i \cdot (k_i - 1) -- each node can connect to each of the other nodes.

The number of links actually in the neighborhood can be written as:

\{ e_{jk} : v_j, v_k \in N_i, e_{jk} \in E \}

And thus, the clustering coefficient for a given node is:

C_i = \frac{|\{ e_{jk} : v_j, v_k \in N_i, e_{jk} \in E \}|}{k_i (k_i - 1)}

We can use software to compute the clustering for each node, and take the average across all of them. The average clustering coefficient is:

Nodes     Clustering Coefficient
4428      0.122
16023     0.123

Is this value high enough to indicate high clustering? For comparison, let's construct two random graphs with the same number of nodes and edges as each of our sample datasets: one with 4428 nodes and 17254 edges, and one with 16023 nodes and 57202 edges. Gephi supports the Erdős–Rényi model, which generates a graph based on the number of nodes and the probability of an edge between any two given nodes. Based on the real graph, we can determine the probability numbers as:

\frac{17254}{4428 \cdot 4427} = 0.0008802
\frac{57202}{16023^2} = 0.0002228

I needed to multiply these by 2 to get them to load into Gephi properly; perhaps there is some correction going on for undirected vs directed graphs.
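
If you'd rather skip Gephi, you can sanity-check the random baseline with networkx. A minimal sketch using the smaller snapshot's parameters (treating the graph as undirected):

import networkx as nx

# Erdős–Rényi graph with the same node count and edge probability as the 4428-node snapshot.
random_graph = nx.gnp_random_graph(n=4428, p=0.0008802, seed=42)
print(nx.average_clustering(random_graph))   # comes out near p, i.e. roughly 0.001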

Finally, we can get the average clustering coefficient for each of these graphs:

Nodes     Clustering (88x31)    Clustering (Random)
4428      0.122                 0.001
16023     0.123                 0.000

Note that the precision of these values is limited by the output of Gephi to 3 decimal places -- those values really are that low!

Our random graph has low (sometimes called vanishing) clustering as the number of nodes increases. This is the typical behavior for a random graph -- as the graph grows, the likelihood of a given node being connected to another node decreases, regardless of whether those nodes are linked by an intermediate node.

In contrast, our 88x31 graph has high (or nonvanishing) clustering even as the number of nodes increases. This is a notable property in and of itself, and brings us halfway to demonstrating the small-world property!

Distance

The average path length L of a graph is the average of the length of the shortest path between each pair of nodes, ignoring node pairs with no connecting path.

We can compute the average path length of our graph to be:

Nodes (N)    log(N)    Avg. Path Length (L)
4428         8.396     7.182
16023        9.681     11.269

For the network to exhibit the small-world property, we need the average path length to be proportional to the logarithm of the number of nodes -- that's why I've added it to the table above.

Between graphs, the average path length increased by a factor of 1.57, while the number of nodes increased by a factor of 3.62 (and its logarithm increased by a factor of 1.15). This isn't exactly the expected value for L \propto \log N (where \propto means "proportional to"), but it's also quite clearly less than the amount that N increased, meaning L \ll N. Personally, I'm comfortable calling this bound satisfied.

A few reasons come to mind as to why we might not be getting the expected value here:

  • N is small enough that this is just noise in the dataset. 16K nodes isn't that big.
  • The topology of the two 88x31 graphs is different. The first graph consists of the nodes discovered off of notnite.com in a few hours, while the second graph is the entire network as identified by our scrapers. The first graph is going to skew towards the English-speaking, indie web, and LGBT communities, while the second graph includes more Italian and Brazilian communities, tabletop RPG forums, and other groups which are broadly separate from the cluster in which the scraper started. Qualitatively, these other groups seem less interconnected compared to the initial dataset.
  • The second graph has a slightly different scraper implementation which analyzes the HTML statically instead of using a webdriver to load the page, potentially missing links on sites which use client-side rendering techniques.

Hubs

Real-world graphs often have hubs. In a transportation network, hubs are points with connections to lots of other places; in the Internet, hubs are major ASNs like Hurricane Electric; in a social graph, hubs are just people who know a lot of other people.

We can see if our 88x31 graph exhibits this same behavior by looking at the degree distribution of the graph. For this, I generated the degree of each node in the full 88x31 graph in Gephi, then exported the table into Python to take a histogram and then into Google Sheets to make charts.

Above is a log-log plot of the degree distribution of the full 88x31 graph. Let's walk through it, since it's not the easiest thing to digest.

Each x-value represents a degree which a node could have, and the corresponding y-value is the number of nodes with that degree. Per the chart, there are about 10,000 nodes with degree 1, but closer to 500 with degree 10, and only a handful with degree 100.

The "clumping" taking place at the bottom is since we're working with a discrete number of nodes. For each of the higher degrees, we'll probably only see only 1 or 2 nodes with that degree, and whether it's 1 node or 2 nodes is up to random chance. Since there's nothing in between, we see those two lines at the bottom of the graph, and we can notice quite a bit more noise there as well. Similarly, towards the top-left of the graph, we're running into quantization there as the degree of a node must be a whole number.

So what are we looking for here? For reasons which will become clear, let's bring back our random graph from earlier and run the same procedure on it:

That looks very different -- sort of like a bell curve around some single most-popular degree value, giving the network a sense of "scale" or "size." The 88x31 graph, on the other hand, is scale-free in that hubs in the network get larger as the network grows.

Scale-free networks are more rigorously defined as those in which the degree distribution follows a power law. That blue line in the graph is the power law function that best fits the data. Especially compared to the random graph, it's pretty clear that our 88x31 graph meets the definition of scale-free.

What are these hubs in our network? In our case, they're mostly sites which try to collect as many 88x31 buttons as possible into a single site, like the neonaut collection (which currently holds the #1 spot).

Conclusions

In this post, we looked at an entirely new network dataset through the lens of network science, discovering that it fits the same criteria that characterize social, communication, transport, and biological networks in our world.

Network science is a field that interests me greatly but that doesn't come up much in my work, so I'm grateful for the chance to work on this project. A lot of key insights about real-world networks from the field come up in unexpected ways.

We'll continue to track the 88x31 network for the foreseeable future. Maybe we'll discover an unmapped part of the network and end up with even more data to sort through. If you'd like to follow the technical effort, feel free to watch or engage via the project's GitHub repository.

Of course, you can join the network as well, and it's as straightforward as it sounds: find a friend with 88x31s on their website, make up your own little 88 pixel by 31 pixel image in your favorite bitmap image editor, and get them to add it to their site along with a link to your website! A few tips:

  • GIMP is a great editor to get started with -- just crank the zoom up to 800% or so depending on your screen size.
  • Save as a PNG (or GIF for animated 88x31s) to prevent lossy compression from messing with your pixel art.
  • When adding a button to your website, use the CSS image-rendering: pixelated property to keep your pixel-art goodness from being blurred on high-DPI displays or when zoomed in.
  • Look for inspiration in the buttons made by your friends, and don't be afraid to experiment!

Hope to see you on eightyeightthirty.one soon!

]]>
https://breq.dev/2023/12/26/88x31-science /2023/12/26/88x31-science Tue, 26 Dec 2023 00:00:00 GMT
<![CDATA[eightyeightthirty.one]]>

The entire mapped network of 16,000+ pages, as of 2023-12-26.

This project was a joint effort by myself and a few friends.

Overview

88x31 buttons are everywhere on the indie web -- they're those tiny buttons on the homepage or footer of sites like mine which link to friends, projects, etc. They've been around for decades and have spread all over webpages and forum signatures. However, until now, there has been no way to view the entire network of 88x31 links all at once.

My friends and I have implemented a scraper which can crawl a page for 88x31 links, a server to manage the queue of sites to crawl, and a web frontend to visualize the graph as a whole.

Motivation

NotNite provided the initial idea and implementation, and after adryd sent me a proof of concept, the three of us and some friends immediately hopped on a call to work out the details.

Technical Description

The implementation consists of three parts: the scraper, the orchestration server, and the frontend.

Scraper

The scraper receives a URL and is responsible for visiting the webpage to look for 88x31s which link to other pages. We experimented with a webdriver-based solution (using Puppeteer), but ended up switching to static HTML parsing for performance, making the tradeoff that the scraper can't read client-side-rendered pages.
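
Just to illustrate the core idea in a few lines of Python, here's a rough sketch (not the team's actual scraper). The 88x31 detection heuristic used here (width/height attributes of 88 and 31) is an assumption; the real implementation is more thorough:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def find_badge_links(url: str) -> list[tuple[str, str]]:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for img in soup.find_all("img"):
        if img.get("width") == "88" and img.get("height") == "31":
            a = img.find_parent("a")
            if a and a.get("href"):
                # (destination page, badge image) pair
                links.append((urljoin(url, a["href"]), urljoin(url, img.get("src", ""))))
    return links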

The scraper is also responsible for noting any redirects that happen, and for trying to identify canonical URLs for pages, just as a traditional search engine crawler would need to.

The scraper is written in Rust and keeps no state, so it can be scaled up horizontally as needed.

Orchestration Server

To coordinate the scrapers and provide access control, an orchestration server is used. This server also accepts URLs into the network (adding them to the queue) and produces the graph file.

The orchestration server is backed by a Redis database -- this was chosen to ensure the queue was stored in memory.

The orchestration server also handles inserting found links into the queue, and correcting records after a redirect is found -- both complex processes to handle edge cases that arise when webmasters change things about their site.

Frontend

The frontend is used to display the graph to the user and allow them to navigate it efficiently. It has some functionality for zooming to a particular node and highlighting its links, and it can show the badge images used to link from one site to another.

We struggled to choose an effective graph library implementation which could render this many nodes on the screen, but eventually settled on Cosmograph as it blew everything else out of the water in terms of performance. The end result still takes time to load the initial graph but feels relatively snappy to navigate even on mobile devices.

Results

The result is a map of over 16,000 pages, either linking to or linked from another using 88x31s.

One thing that we didn't expect to happen was webmasters noticing our user-agent in their access logs and inquiring about our work. Apparently folks aren't used to scrapers which specifically target 88x31 images! It was very cool to get to explain this project to those who asked about it, and we made sure to clearly identify ourselves and provide opt-out instructions in case any operators disagreed with our mission.

I also got to finally use everything I learned in a network science course last year to analyze our resulting data, which was quite fun. TL;DR: our dataset of 88x31 links is similar to other social networks which have been studied in the field!

]]>
https://breq.dev/projects/eightyeightthirtyone /projects/eightyeightthirtyone Tue, 26 Dec 2023 00:00:00 GMT
<![CDATA[In Defense of Really Long Merge Requests]]>

Disclaimer: This article may or may not be an application of Cunningham's Law. I'd love to hear about how other teams work and why people like (or dislike!) their current way of doing things.

I've maintained an (albeit small) open-source library, led software development on a robotics team of >40 people, and worked on a professional software team on co-op. Through this, I've gotten to experience a range of code review styles and merge request etiquette. A common piece of wisdom I've seen is: "keep your merge requests small, so they are easily reviewable." Conventional wisdom states that longer merge requests are more difficult to review -- obviously this is true, since there are more changes to look through. I'd like to argue, however, that in the long run, writing longer merge requests can save time and effort for a team.

Breaking features down is burning developer effort

Suppose you've set out to implement a feature, and you've ended up with a much larger implementation than you expected. Should you split that implementation into smaller merge requests to make life easier for your reviewers?

What makes breaking up merge requests tricky is that you'll need to verify that the state of the repository in between your two merge requests is valid. You need to create an intermediate version, then test that intermediate version. Semantically, this "in-between" version might make no sense, since it contains a partially-completed feature.

If you're using a tool like Rust's clippy to analyze your code, the intermediate version might need to be littered with #[allow(dead_code)] or your language's equivalent in order to pass CI. This clutters the commit history and provides no real value. And what if you accidentally leave in one of those #[allow(dead_code)] functions later?

Splitting a merge request into more than two parts makes this even more complicated. In many cases, it just isn't practical to break a single feature into multiple semantically reasonable and CI-passing intermediate steps.

More frequent code reviews means more "round trips" and greatly reduced speed

When you submit a merge request, you're essentially handing things off to your reviewer. A good reviewer will get back to you within 24 hours, but someone who's busy with other work might take even longer. In the meantime, what do you do? In most cases, you'll end up switching to another project, losing steam on the feature you were working on. Only when they get back to you can you resume work on the feature. The more of these "round trips" you have to deal with, the longer it will take you to implement useful features.

The good news is that there are ways to gather feedback from others in the middle of implementing a feature that don't slow you down! You can set up a 1-on-1 to talk through design decisions, exchange UML drawings, pair program, or just Slack your teammate a link to your branch. They can then respond at their own pace, without blocking you from doing your work.

If merge requests queue up, addressing concerns becomes nontrivial

If you're impatient, you might not wait for one of your merge requests to be approved before starting work on the next. After all, they're all part of one feature, so it makes sense to tackle them sequentially. So you start your feature-2 branch before your feature-1 branch is merged in.

In order to pull this off, however, you'll need to branch feature-2 off of feature-1, since the changes in feature-1 aren't in main yet. This works fine for now. And if your code is good, it's not a problem, since you'll just merge feature-1 into main first, then merge feature-2 in after.

This falls apart as soon as your reviewer points out an issue with feature-1. You switch to that branch and commit your fixes. Then, while waiting to hear back from your reviewer, you go back to developing on feature-2. But feature-2 doesn't have those changes, meaning you'll have to either rebase it onto feature-1 (which is annoying if any teammates have also checked out that branch) or merge feature-1's most recent commit into it (which clutters up history). Then your reviewer requests more changes, and you need to do the song and dance once again. And you probably can't just merge all of your changes into feature-2 once feature-1 gets merged, since they're so tightly coupled.

This gets quadratically worse if you have a longer chain of merge requests. I've seen situations like this up to four layers deep, where each change to the oldest MR requires rebasing every other MR in sequence. This is an awful experience for everyone involved, and it leads to comments on merge requests like "Fixed this in [later merge request], please approve this one now," which is definitely not the way things should work.

Reviewers need to consider a certain amount of context regardless

When you're reviewing code changes, you aren't just thinking about the code itself. You need to consider everything else in the system that the code interacts with.

Suppose you're incrementally replacing a control stack for a device written in ROS. Conceptually, the new and old implementations will be quite different.

You could go about this by rewriting each node in turn, each as its own merge request. You'll need to plan an order to migrate nodes in that allows you to reach the "target" system architecture, but you'll also need to not break the existing system along the way, since the system should be functional at each point in between the merge requests. If you want to rename certain topics shared between nodes, or change the message type passed between nodes, you'll need to carefully plan when you do that as well.

Then, your reviewers will need to consider each change in the context of both the existing system and the future/new system. Again, you can't break things in between merge requests, but you also want to make sure you're architecting things properly for the final system.

Or, you can just replace the old system with a new system in a single, atomic merge request, allowing your reviewers to focus exclusively on reviewing the new system.

For a different example: consider implementing a program that bridges a serial, I2C, or SPI connection and a Redis pub/sub channel, for instance forwarding messages from serial into Redis and vice versa. Suppose you want your program to handle several data types and have a consistent interface on each hardware interface.

You could break this feature into separate merge requests for each interface. Then, when your reviewers went to review each merge request, they would need to understand the management of the Redis connection, the datatype system used throughout the codebase, the differences in protocol between each of the planned hardware interfaces, and the handling of the specific hardware interface actually implemented in the MR.

This isn't too bad if these merge requests are reviewed one after the other, but what if a week or more goes by between each of them? Your reviewer will need to remember each of these things, every time, ultimately leading to spending more time on review compared to just reviewing everything all at once.

Long diffs should take longer to review

There's a common refrain that "you can't possibly catch everything in a patch of more than X lines."

Consider this: would you rather read one chapter of a book every week or two, or spend a few days reading as much as you can? Which would allow you to better follow the story? Keep track of the characters? Understand the book at a deeper level?

I see no fundamental reason why a reviewer would have a lower success rate at spotting issues per line of code for a larger diff as opposed to a smaller one. It will require more time to read and understand, and it will require a reviewer to potentially be more deliberate in their review, but it ultimately leads to the same quality assurance with substantially reduced effort for the development team.

]]>
https://breq.dev/2023/07/27/long-diffs /2023/07/27/long-diffs Thu, 27 Jul 2023 00:00:00 GMT
<![CDATA[So, You Want To Stream Lots Of USB Cameras At Once]]>

Let's say you're building a live camera feed for something. Maybe it's a 3D printer, or an RC car, or a Mars Rover analogue (website hasn't been updated in years). Here's a pretty typical requirements list:

  • Video feed sent over an IP network
  • <500ms latency
  • Reasonably efficient encoding (e.g. H.264)
  • 640x480 resolution, 30 fps
  • Recover from network failures

With one stream, you're in pretty good shape! Just plug in any off-the-shelf USB camera, run a quick GStreamer command on both ends, and boom, you're good to go. But as you add more cameras into the mix, you'll run into some hiccups at seemingly every point in the pipeline.

This blog post is the culmination of 2 years of work into camera streaming pipelines on NU ROVER, Northeastern University's team for the University Rover Challenge. Over that time, we've evolved from an unreliable six-camera setup requiring four separate devices to an efficient, easy-to-use, reliable system encompassing 14 cameras on a single computer. This post will summarize the experimentation we did, the decisions we made, and the lessons we learned, starting with the individual camera modules themselves and working up through the pipeline all the way to how the feed is displayed.

Cameras

A camera module connects some image sensor to some interface. The choice of image sensor doesn't matter much; NU ROVER chooses camera modules based purely on FOV and resolution. The interface, however, greatly affects what hardware you can use for streaming.

CSI

Camera Serial Interface (CSI), sometimes called "MIPI," is the most barebones protocol for connecting a camera: the protocol is specifically designed for cameras and connects directly to the CPU on boards like the Raspberry Pi. A good example of this type is the Raspberry Pi Camera, but other manufacturers supply various cameras with different FOVs and sensors.

Pros:

  • Direct connection with processor, very low overhead

Cons:

  • Limited to number of CSI lanes provided by platform (e.g. 1 for Raspberry Pi, 2 for Jetson Nano, 6 for Jetson Orin)
  • Spec limits cable length to 30cm (although it usually works up to 2m)
  • Requires software tailored to the platform (e.g. raspivid), which may complicate integration with streaming software
  • More fragile cables than USB

In the past, we used multiple devices (2 Raspberry Pis, a Jetson Nano, and a Jetson TX2) to connect many CSI cameras into our system. This dramatically increased complexity and required installation of a network switch, taking up precious space inside the rover. Our new system does not make use of CSI cameras.

IP

IP cameras are often used as part of home security systems, making them readily available. They directly connect using Ethernet, removing the need for a separate device for encoding. Our team purchased a few, and I spent some time testing them, but I ruled them out pretty quickly. They're physically much harder to mount and integrate into a system, and since they're typically designed for recording and not live viewing, they often have very high latency.

Pros:

  • Minimal configuration needed
  • Designed to scale up to >10 camera systems
  • Durable and weatherproof

Cons:

  • Cameras are quite bulky and thus harder to find mounting points for
  • Cables are much larger and difficult to route
  • Requires using a (potentially very large) network switch to connect multiple
  • Provide limited or no encoder configuration (e.g. no way to adjust stream bitrate)
  • High latency (~1000-2000ms)

If you're working on a physically large, stationary, or spread-out system, IP cameras might be worth considering. Otherwise, the added bulk is a substantial drawback.

Analog Video

This year, our team partnered up with NUAV, Northeastern's unmanned aerial vehicles organization, to utilize a drone in part of the autonomous challenge. Their camera feed (used only during emergency teleoperation) was a 5.8 GHz off-the-shelf system common in FPV (first-person view) drone operation. These systems use an analog video signal to send video to the operator's goggles. While analog video is tempting as a self-contained system, it doesn't scale effectively to a multi-camera setup.

Pros:

  • Standalone system
  • Low latency
  • Graceful signal degradation if interference exists

Cons:

  • Requires a dedicated antenna, receiver, and display to view each feed
  • Each video feed requires its own hardware
  • Almost always fixed to 5.8 GHz (shorter range compared to 2.4 GHz WiFi or 900 MHz IP radios)
  • Limited number of channels/cameras (8)

We considered installing an analog video camera to use if our primary network link dropped out, but given the characteristics of the 5.8 GHz band, there are few situations in which this would be helpful. (If we can get a 5.8 GHz signal to the rover, we can almost definitely get a 2.4 GHz signal to it as well.)

USB

This brings us to USB cameras, arguably the most common type in consumer use today. These seem like an obvious choice, but there are a few complications when you scale beyond one or two cameras (which we'll get into).

Pros:

  • Cable and connector are durable but not too bulky
  • Standard software support across platforms
  • Small modules available with standardized mounting holes from Arducam, ELP, etc.
  • Unlimited number of cameras on each computer (we'll get to this...)

Cons:

  • Some overhead compared with CSI

It's no surprise USB was our first choice when we began developing the system. Even if you rely on CSI cameras for some systems, you're unlikely to be able to build a low-cost and practical system that incorporates 10 or more of them, meaning you'd have to mix in some USB cameras as well.

USB Bandwidth

We're all used to just plugging in USB devices arbitrarily and having everything work. Cameras, however, tend to push the USB 2.0 connection to its limit, and using several of them on one computer can be surprisingly difficult.

A bit of bandwidth math

Consider the bandwidth requirement of a typical video stream: 640x480 pixels, 30 fps.

Quick note: If you've worked with images, you might assume we need 8 bits per channel to represent each pixel in the frame. This isn't quite true -- we can save a little extra space with some smart encoding. Most raw video from USB cameras is in "YUYV" format, and contains three values: Y (luminance, or "brightness"), U (color), and V (also color). The eye is more sensitive to small variations in luminance than to similar variations in color, so we can save space by using 8 bits for luminance and 4 bits for each of the color signals, giving a total of 16 bits per pixel per frame.

The requirement for a single frame is thus:

640 × 480 × 16 = 4,915,200 bits

And the requirement for a single second of video is:

4,915,200 × 30 = 147,456,000 bit/s ≈ 147 Mbit/s ≈ 18.4 MByte/s

The theoretical maximum speed of USB 2.0 is 480 Mbit/s, and after subtracting signal overhead, we're left with only about 53.2 MByte/s of usable bandwidth. For a single camera, this is fine! We've got plenty of bandwidth left over.

A diagram showing a camera connected to a USB controller, showing only using a small portion of the total link capacity

What happens when we add multiple cameras? It depends. Let's first consider the case that the cameras are on different USB controllers from each other -- we'll go over what this means in a second. This is also fine.

Three parallel copies of the prior diagram

Now what if we try to put multiple cameras on a single USB hub? Here's where we start to run into problems.

A diagram showing three cameras connected to one USB controller through a hub, where the link between the hub and the controller is oversaturated

In a setup like this, two of the cameras will work fine, but the third will not work. Specifically, you'll be able to open all three device files simultaneously, but you'll only receive video frames from two of them. The third camera will not raise an error, which can make troubleshooting annoying.
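
Here's a quick back-of-the-envelope check of why the third camera loses out, written as a small Python sketch using the numbers from above (treat the 53.2 MByte/s figure as an estimate):

# Why two raw YUYV cameras fit on one USB 2.0 controller but three don't
WIDTH, HEIGHT, BITS_PER_PIXEL, FPS = 640, 480, 16, 30
USABLE_BUS_BYTES = 53.2e6   # approximate usable USB 2.0 throughput after overhead

bytes_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL // 8 * FPS   # ~18.4 MByte/s per camera

for cameras in (1, 2, 3):
    needed = cameras * bytes_per_second
    print(f"{cameras} camera(s): {needed / 1e6:.1f} MByte/s needed -> "
          f"{'fits' if needed <= USABLE_BUS_BYTES else 'oversubscribed'}")
# 1 camera(s): 18.4 MByte/s needed -> fits
# 2 camera(s): 36.9 MByte/s needed -> fits
# 3 camera(s): 55.3 MByte/s needed -> oversubscribed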

If it seems weird that a reasonable USB setup fails like this, that's because it is. Most devices don't come anywhere near the USB bandwidth limit. And if they do, they'll typically handle it gracefully: your SSD might copy files over a little bit slower, or your printer might take a bit longer to load the document you sent it.

With a USB camera, though, there's no good way to handle a lack of bandwidth: it doesn't have the memory to delay frames and send them later, it can't send a smaller-sized image since the software expects a specific resolution, and there's no way for it to coordinate with other cameras to share the bandwidth by slowing down framerate. Thus, you get nothing.

Let's take a step back and look at what this means in practice. A USB controller is the "root" of the tree of USB devices. It's a chip that connects the USB bus to the CPU, either directly or via a protocol like PCIe. In practice, most computers only have one or two controllers, then use internal USB hubs to provide more physical USB ports.

For instance, a Raspberry Pi 3 has four physical USB ports but only a single internal USB controller, meaning all of the ports share bandwidth. As a result, you're unlikely to get more than two USB cameras working in YUYV mode at this resolution simultaneously on the Pi.

USB 3.0

However, most modern devices have USB 3.0 ports. Can those help us out? USB 3.0 has a theoretical maximum rate of 500 Mbyte/s, so we're good, right?

An incorrect diagram showing the oversaturated link "upgraded" to USB 3.0 by replacing the hub and controller with USB 3.0 variants

Unfortunately, this is not how this works.

USB 2.0 uses a single differential pair of wires for signaling. USB 3.0 adds extra wires to the cable and connector, for a total of 3 differential pairs. The two additional pairs are used for "SuperSpeed" signaling (the fast data rates), but the original pair is still just a regular USB 2.0 connection. USB 3.0 traffic is handled completely separately from USB 2.0 traffic.

In essence, a "USB 3.0 cable" is a USB 3.0 cable and a USB 2.0 cable in the same housing. Consequently, a "USB 3.0 hub" is a USB 3.0 hub and a USB 2.0 hub in the same box, wired up to different pins in the same connectors. And you guessed it, a "USB 3.0 controller" is just a USB 3.0 controller and a USB 2.0 controller in the same chip.

Here's the output of lsusb -t on a Raspberry Pi 4:

/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M

You can see that despite the Pi 4 having a single USB controller chip, it appears in Linux as two entirely separate USB buses -- one for USB 2.0 (the "480M" is the port speed in Mbit/s) and the other for USB 3.0 (listed as "5000M").

Here's a diagram that more accurately captures what's going on:

the diagram showing the oversaturated link, but with a separate USB 3.0 link off to the side with zero utilization

Since the cameras are all USB 2.0, all of the data they send stays on the USB 2.0 pair all the way to the controller. (USB 3.0 cameras exist, but are expensive and rare outside of industrial settings.)

MJPG encoding

Your webcam probably runs at a much higher resolution than 640x480. The popular Logitech C920, for instance, supports 1080p video at 30 fps and runs over USB 2.0, which seems like it'd use up more bandwidth than we have, right? Something doesn't add up.

1920 pixels × 1080 pixels × 16 bits × 30 fps ≈ 995.3 Mbit/s ≈ 124.4 MByte/s > 53.2 MByte/s

These cameras support higher resolutions and framerates by compressing frames before sending them to the computer, using a technique known as MJPG (motion JPEG). Frames are captured by the camera, compressed using the JPEG algorithm, and then sent along the wire to the computer, at which point they can be decompressed.

Different cameras will use different compression ratios depending on the resolution and framerate, meaning we don't have hard numbers to go off of. The improvement from motion JPEG in a multi-camera setup is unfortunately not that great: compression ratios are set assuming a given camera is the only device on the bus, so the bandwidth reduction at smaller resolutions is much less significant. In practice, I've found that up to 3 cameras can coexist on a single bus at this resolution and framerate, meaning it's a pretty minor improvement.
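
If you want to experiment with MJPG mode yourself, the trick is to request image/jpeg caps from the camera instead of raw video. Here's a minimal sketch using the GStreamer Python bindings -- the device path and caps values are assumptions, so adjust them for your camera:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Ask the camera for JPEG-compressed frames, decode them on the host,
# and show them in a window. /dev/video0 is a placeholder device path.
pipeline = Gst.parse_launch(
    "v4l2src device=/dev/video0 "
    "! image/jpeg,width=640,height=480,framerate=30/1 "
    "! jpegdec ! videoconvert ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)

# Block until an error or end-of-stream, then shut down cleanly.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.ERROR | Gst.MessageType.EOS)
pipeline.set_state(Gst.State.NULL)

The same caps filter works in a gst-launch-1.0 one-liner; the important part is that the JPEG compression happens on the camera itself, before the data crosses the USB bus.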

Your Toolbox

Here are the tools you have to try to solve the USB bandwidth problem:

  • Add more devices, and split up the cameras between them. NU ROVER did this initially, but it complicates your entire system quite a bit -- you now have to deal with a network switch, plus however many extra devices you brought in.
  • Find a way to add more USB controllers to a particular device. When we upgraded our systems to the Jetson Orin, we added a USB expansion card into the PCIe slot. In particular, we chose a "Quad Chip" model which contained four separate USB controllers. These are popular among VR enthusiasts -- it turns out connecting many stationary VR tracking towers leads to similar bandwidth issues as USB cameras.
  • Switch to the MJPG mode instead of the YUYV one. This adds a bit of latency since the device now has to decode the MJPG image, but the bandwidth improvements are often significant enough to make it worthwhile. Plus, the MJPG modes are typically more similar across camera models -- some cameras can be picky about what framerates they accept for YUYV streaming, for instance.
  • Plan which camera connects to which controller carefully. If you plan on dynamically starting and stopping streams, and you know you'll never want two specific cameras running simultaneously, you can put them on the same controller and they'll never have to fight over bandwidth. Conversely, if there's a set of cameras you plan on using all together, make sure not too many of them are on the same controller.
  • Apply some driver tweaks to squeeze out a little extra bandwidth. The Linux UVC driver provides a quirk (UVC_QUIRK_FIX_BANDWIDTH) that makes the driver calculate the bandwidth requirements for each device itself, which can sometimes be more accurate than what the camera reports.

Encoding

You've gotten an image from the camera. Now what?

You'll probably want to use GStreamer to link together most of your pipeline. GStreamer is a framework that allows you to chain together different video elements, such as sources, encoders, payloaders, and decoders, into a pipeline that is executed all at once. The concepts in this post aren't GStreamer-exclusive, and I won't be providing many "ready-to-go" pipelines since they vary based on hardware encoding support and system, but I will at times assume that you're linking together your source, encoder, and payloader using GStreamer.

The most common codec for video compression is H.264, which is generally good enough for most purposes. Other codecs like H.265, AV1, and VP9 can offer lower bandwidth, but the benefits are minor at smaller video resolutions. The most important factor is hardware encoding and decoding support; if the encoding is handled by the CPU, it may not keep up with many video streams simultaneously, especially on lower-end devices.

The two main parameters to tune are bandwidth and keyframe interval. Bandwidth, in this case, is the amount of video data sent over the network per second. If your bandwidth is set too high, you'll start to see artifacts as packets are dropped whenever your network link slows down (for instance, if you have a wireless link and your robot drives behind a wall). If it's set too low, your video quality will suffer and your feed will be full of compression artifacts. If you're operating over a wired Ethernet connection, you can probably divide the bandwidth of the Ethernet link by however many streams you plan to have simultaneously, but take into account any other network traffic on the machine, too. NU ROVER uses a bandwidth of 1 Mbps for each 480p stream.

Keyframe interval, also known as "I-frame interval," refers to the time in between I-frames in the video signal. H.264 encoded video consists of I-frames (frames which contain a complete image), P-frames (which reference previous frames), and B-frames (which can reference future frames and are only relevant in the context of a pre-recorded video). The purpose of P-frames is to improve the video compression by not sending the same data in every frame -- if your camera feed is mostly a still image, it is wasteful to send that entire image on every frame when you can instead only send the specific areas of the image which changed.

If you have too many I-frames, you will devote too much of your bandwidth to them and have worse picture quality as a result. However, if you have too few, you may have periods where you can't see the entire image if an I-frame gets dropped due to network conditions. This usually manifests itself as a gray screen in which moving areas gradually return to color while motionless areas remain gray until the next I-frame. NU ROVER uses 1 keyframe every 30 seconds.
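
As a concrete sketch of where these two knobs live in a GStreamer pipeline: with the software x264enc encoder (hardware encoders expose similar properties under different names), bitrate is given in kbit/s and the keyframe interval in frames, so 1 Mbps with one keyframe every 30 seconds at 30 fps looks roughly like this. The test source and destination address are placeholders:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# bitrate is in kbit/s; key-int-max is in frames (900 frames = 30 s at 30 fps).
# Swap videotestsrc for your camera source and 192.168.1.11 for your client.
pipeline = Gst.parse_launch(
    "videotestsrc ! video/x-raw,width=640,height=480,framerate=30/1 "
    "! x264enc tune=zerolatency bitrate=1000 key-int-max=900 "
    "! rtph264pay ! udpsink host=192.168.1.11 port=9090"
)
pipeline.set_state(Gst.State.PLAYING)

# Block until an error occurs; the stream otherwise runs indefinitely.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.ERROR | Gst.MessageType.EOS)
pipeline.set_state(Gst.State.NULL)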

Protocols

Now that you have your encoded video feed, you'll need to find a way to send it over the network. There are a few existing protocols that can help you here. Of course, you'll need to use the same protocol on both sides of the connection.

Motion JPEG

Motion JPEG (MJPG) can also be used to send video over a network, frame by frame, usually over HTTP. Each frame is compressed individually, making it one of the simplest protocols. However, because there's no inter-frame compression, this protocol uses much more bandwidth to stream video at a given quality than others.

HTTP Live Streaming

HTTP Live Streaming (HLS) is a protocol used to stream H.264 videos over the internet. It works by dividing the video into chunks, then serving each chunk in turn using HTTP. This chunking process introduces a high amount of latency, making HLS unsuitable for a low-latency feed.

Real-time Transport Protocol

Real-time Transport Protocol (RTP) is one of the simplest ways to send video over a network. It sends data over UDP, meaning network errors and dropped packets are not corrected like they would be with a TCP-based protocol. This is ideal for low-latency applications -- if one frame isn't received correctly, you don't want to spend time retransmitting it when you could instead be transmitting the next frame. However, RTP could be a poor choice for streams where quality is more important than latency.

RTP works by payloading data into packets and sending them to a predefined UDP host and port. This payloading process will look different depending on the codec you choose -- for now, let's assume we're using H.264. GStreamer provides the rtph264pay plugin to payload H.264 data, as well as the rtph264depay plugin to depayload it on the other end.

For example, let's assume the video server is on 192.168.1.10, the client is on 192.168.1.11, and the video stream should be on port 9090. We can set up an RTP stream by running this GStreamer pipeline on the server:

gst-launch-1.0 videotestsrc ! video/x-raw,framerate=30/1,width=640,height=480 ! x264enc tune=zerolatency ! rtph264pay ! udpsink host=192.168.1.11 port=9090
  • videotestsrc is just a generic test video source (color bars)
  • The video/x-raw part forces the output of videotestsrc to have the specified resolution and framerate
  • x264enc is the encoder (which runs on the CPU -- something to avoid)
  • rtph264pay payloads the stream for RTP
  • udpsink sends those payloaded packets over the network to the specified host and port (our client)

Then, on our client, we can run this to decode the stream:

gst-launch-1.0 udpsrc port=9090 ! application/x-rtp,payload=96 ! rtph264depay ! avdec_h264 ! autovideosink
  • udpsrc reads incoming packets on a particular UDP port
  • The application/x-rtp part says that those packets should be interpreted as an RTP stream
  • rtph264depay turns the packets into a raw H.264 stream
  • avdec_h264 decodes the H.264 stream (on the CPU)
  • autovideosink displays the stream

To scale this to multiple camera feeds, we just need to choose different UDP ports for each feed. There are plenty of available port numbers for this.

Real Time Streaming Protocol

The solution above using RTP leaves a few things to be desired:

  • Having to manually assign port numbers is annoying
  • Starting or stopping streams must be done by manually launching or killing processes on the server
  • Raw RTP (with dynamic payload type 96) doesn't work with VLC or other standard media players
  • The server must know the IP address of the client, not the other way around

Real Time Streaming Protocol, or RTSP, is a protocol built on top of RTP which can solve these shortcomings. An RTSP server at first behaves much like an HTTP server, and RTSP URLs look similar to HTTP URLs.

Let's look at what happens when a client tries to load rtsp://192.168.1.10/video:

  • The client opens a connection to 192.168.1.10 on port 554 (the default RTSP port)
  • The client sends an OPTIONS request to the server for /video
    • The server responds with a list of request types accepted (usually OPTIONS, DESCRIBE, PLAY, TEARDOWN...)
  • The client sends a DESCRIBE request for /video
    • The server responds with a list of available video formats
  • The client sends a SETUP request for /video, and provides a pair of open ports: an even-numbered port for receiving video data, and an odd-numbered port for receiving UDP control signals using RTCP (RTP Control Protocol).
    • The server responds with its own pair of ports
  • The client sends a PLAY request for /video
    • The server begins sending video data to the client on the specified ports
  • The client sends a TEARDOWN request for /video
    • The server stops sending video data to the client

RTSP provides a few other methods, including PAUSE, which can be useful for interruptible streams.

To play around with RTSP, you can use the gst-rtsp-launch project:

gst-rtsp-launch "( videotestsrc ! video/x-raw,rate=30,width=640,height=480 ! x264enc tune=zerolatency ! rtph264pay pt=96 name=pay0 )"

Then, connect with GStreamer's rtspsrc plugin:

gst-launch-1.0 rtspsrc location=rtsp://127.0.0.1:8554/video latency=0 ! rtph264depay ! avdec_h264 ! autovideosink

Or, connect with VLC by entering the RTSP URL into the "Open Network Stream" window.

If you're building out a system with multiple video streams, you'll want to use the GStreamer bindings to define your own RTSP server's request handling logic. Here's an example of how to use the Python bindings, but bindings for C++ and other languages also exist. NU ROVER has our own in-house implementation which we may decide to open-source once it stabilizes, assuming we have time to do so.
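
As a rough sketch of what those Python bindings look like -- the mount paths and launch strings below are placeholders, not NU ROVER's actual configuration:

import gi
gi.require_version("Gst", "1.0")
gi.require_version("GstRtspServer", "1.0")
from gi.repository import Gst, GstRtspServer, GLib

Gst.init(None)

server = GstRtspServer.RTSPServer()   # listens on port 8554 by default

# Each mount point gets a factory describing the pipeline to run when a
# client requests that path. These sources are stand-ins for real cameras.
cameras = {
    "/front": "videotestsrc pattern=ball",
    "/rear": "videotestsrc pattern=smpte",
}

mounts = server.get_mount_points()
for path, source in cameras.items():
    factory = GstRtspServer.RTSPMediaFactory()
    factory.set_launch(
        f"( {source} ! video/x-raw,width=640,height=480,framerate=30/1 "
        "! x264enc tune=zerolatency ! rtph264pay name=pay0 pt=96 )"
    )
    factory.set_shared(True)   # let multiple clients share one pipeline
    mounts.add_factory(path, factory)

server.attach(None)            # attach to the default GLib main context
GLib.MainLoop().run()          # streams at rtsp://<host>:8554/front and /rear

A real implementation would generate these factories from whatever camera configuration you maintain, and could subclass RTSPMediaFactory to start and stop hardware pipelines on demand.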

Extra Signaling Logic

One limitation of RTSP is that by default there is no way for a client to list the streams available on the server. If you don't want to maintain a list of streams on both sides of the connection, you may want to add some extra logic to handle this.

One approach would be to try to send this data over the RTSP connection somehow, say, by returning it in response to a DESCRIBE request for a certain URL. Doing this would require the ability to embed this handling logic into whatever RTSP client and server you use, which could be difficult.

The approach NU ROVER took instead was to use another communication channel which we already had (ROS messages) to send this data. Alternatively, you could rely on an HTTP API (sketched below), a database, or some other method. You might want such a protocol to handle:

  • Listing available stream paths on the server
  • Sending metadata for browsing (stream name, description, etc)
  • Sending technical metadata (resolution, framerate, camera orientation, bandwidth requirements, etc)
  • Handling "exclusivity" of streams (e.g. if two streams are from the same camera but at different resolutions, and only one can be used at a time)
  • Sending calibration data and exact positions for relevant cameras, to enable computer vision uses on the other end of the RTSP stream
  • Enabling or disabling baked-in overlays in the video signals (timestamp, camera name, etc)
  • Pausing the video stream to take high-resolution "snapshots" of the video feed
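
As one deliberately minimal example of the first two items above, here's a sketch of a plain HTTP endpoint the client could poll to discover streams -- the stream list, paths, and port are all made up for illustration:

import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stream metadata -- in a real system this would be generated
# from whatever knows about the cameras (config file, udev, ROS params, ...).
STREAMS = [
    {"path": "/front", "name": "Front camera", "width": 640, "height": 480, "fps": 30},
    {"path": "/rear", "name": "Rear camera", "width": 640, "height": 480, "fps": 30},
]

class StreamListHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/streams":
            self.send_error(404)
            return
        body = json.dumps(STREAMS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# The client fetches http://<server>:8000/streams, then opens
# rtsp://<server>:8554<path> for each stream it wants to display.
ThreadingHTTPServer(("0.0.0.0", 8000), StreamListHandler).serve_forever()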

Clients

What client are you going to use to decode and display the video signal? You'll likely want your video feeds to be integrated into whatever control UI you use for other operations, so this will depend on your situation.

VLC

VLC Media Player is a common program for playing various media files, and it can play video from an RTSP source. Go to "File" → "Open Network Stream" and enter your RTSP URL.

VLC is typically meant for watching movies or other prerecorded content, meaning it buffers video to smooth out playback, dramatically increasing latency. While there are ways to reduce the latency (I couldn't find a good comprehensive source, so just search for "VLC RTSP low latency"), I haven't had much success. I'd suggest keeping VLC around to help with troubleshooting streams, but not relying on it in your "production" system.

GStreamer

GStreamer's rtspsrc can also load video data over RTSP, and its glimagesink plugin can display video data in a window, making it functional as a standalone RTSP viewer. (Remember to specify latency=0 for rtspsrc.) However, if you want your video feed to be displayed in the same window as the rest of your UI, you'll likely need something more complex.

Qt, etc.

Bindings often exist to display media from a GStreamer pipeline in applications built with frameworks like Qt (QtGStreamer), React Native (react-native-gstreamer), and others. You can find a pipeline that works by launching with the gst-launch-1.0 command, then copy it into your application.

Browsers

Web-based user interfaces are becoming much more popular, and NU ROVER has our own based on React. Of course, you can't run arbitrary C++ code inside of a browser sandbox, nor can you access UDP/TCP ports directly, so embedding a GStreamer widget directly in the browser isn't feasible.

Browser Specifics

Sending low-latency video to a web browser is more difficult than it would seem. Browsers are built for media consumption, meaning they often buffer video to smooth out playback at the cost of higher latency. There are quite a few ways to include videos in a web browser, none optimal.

HTTP Live Streaming

HLS seems like it's exactly right for this use case, but the required "chunking" of video data adds a high amount of latency, making it unusable for a low-latency video feed.

Raw H.264

It is possible to directly send an H.264 stream to the browser. However, pre-existing solutions for streaming H.264 are difficult to come by. One example is the ROS web_video_server package, which implements its own HTTP server to handle this.

The benefit of this approach is that video can be embedded in a website with a <video> tag, and the video is compressed when entering the browser, reducing bandwidth use and allowing the browser to choose the most efficient decoder for the platform. The important drawback is that there is no way to control the amount of buffering that the browser does to the video feed, and browsers like to buffer a significant chunk of frames to ensure smooth playback. NU ROVER experimented with using JavaScript to automatically seek the video to the most recent frame in the buffer, but found this difficult and unreliable.

Motion JPEG

One potential approach is to run a GStreamer pipeline on the client which decodes the H.264 data, then compresses each frame as a JPEG, then sends those into the browser.

Sending a series of images to the browser can be done using the multipart/x-mixed-replace mimetype: the server sends an HTTP response with Content-Type: multipart/x-mixed-replace;boundary=<boundary_name>, then sends a series of JPEG frames, each preceded by a --<boundary_name> line.

This is the approach NU ROVER currently uses. We've found that encoding and decoding JPEG images adds a small but acceptable amount of latency. We use an in-house project to serve these images, starting and stopping the underlying GStreamer pipelines and RTSP streams on demand.

A tip for using GStreamer with an application like this: to send buffers from a GStreamer pipeline into your application, you'll want to use appsink. Here's an example of code that uses this.
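
Here's a rough, self-contained sketch of the idea in Python: an appsink pulling JPEG frames out of a pipeline, and an HTTP handler pushing them to the browser as multipart/x-mixed-replace. The RTSP URL, port, and element choices are placeholders, not NU ROVER's actual code:

import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Decode an RTSP stream and re-encode each frame as JPEG; the appsink hands
# those JPEG buffers to Python. The URL is a placeholder.
pipeline = Gst.parse_launch(
    "rtspsrc location=rtsp://192.168.1.10:8554/front latency=0 "
    "! rtph264depay ! avdec_h264 ! videoconvert ! jpegenc "
    "! appsink name=sink emit-signals=true max-buffers=1 drop=true"
)

latest_frame = None
frame_ready = threading.Condition()

def on_new_sample(sink):
    global latest_frame
    sample = sink.emit("pull-sample")
    buf = sample.get_buffer()
    ok, info = buf.map(Gst.MapFlags.READ)
    if ok:
        with frame_ready:
            latest_frame = bytes(info.data)
            frame_ready.notify_all()
        buf.unmap(info)
    return Gst.FlowReturn.OK

pipeline.get_by_name("sink").connect("new-sample", on_new_sample)
pipeline.set_state(Gst.State.PLAYING)

class MJPEGHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "multipart/x-mixed-replace;boundary=frame")
        self.end_headers()
        while True:
            with frame_ready:
                frame_ready.wait()
                frame = latest_frame
            self.wfile.write(b"--frame\r\n")
            self.wfile.write(b"Content-Type: image/jpeg\r\n")
            self.wfile.write(b"Content-Length: %d\r\n\r\n" % len(frame))
            self.wfile.write(frame + b"\r\n")

# Point an <img src="http://<host>:8080/"> at this server to view the feed.
ThreadingHTTPServer(("0.0.0.0", 8080), MJPEGHandler).serve_forever()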

Motion BMP?

In theory, you could send a multipart/x-mixed-replace payload with any image format, not just JPEG. Sending an uncompressed bitmap image (.bmp), for instance, would remove the need to encode and decode the JPEG but would increase the data sent to the browser. In our case, we're running the client on the same machine as the browser, meaning this added data transfer is largely irrelevant.

This approach is a bit harder to implement than Motion JPEG: effectively, you just need to add the proper headers to make each frame a BMP file, but we haven't found a good GStreamer plugin that does this. In theory, this could be done in the HTTP server when the GStreamer buffers are read.

Stream Limit

If you try to load more than six concurrent Motion JPEG streams in a browser, you'll notice that everything beyond the first six fails to load. To prevent creating too many HTTP connections when a site loads, browsers restrict the number of concurrent HTTP connections to a single host to six.

In Firefox, it's possible to change this setting by going to about:config and changing the network.http.max-persistent-connections-per-server setting from 6 to a higher value, say, 20. However, there is no way to change this setting in Chrome.

One workaround, if you have a DNS server handy or can modify /etc/hosts, is to configure multiple subdomains for the same server, as described in this StackOverflow answer.

An example /etc/hosts file could look like this:

192.168.1.10 cameras1.local
192.168.1.10 cameras2.local
192.168.1.10 cameras3.local
192.168.1.10 cameras4.local

Alternatively, a DNS server could be used with a wildcard record to allow any subdomain to work.

Then, ensure that no more than six streams are loaded using the same subdomain, and all streams should work fine.
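
One simple way to do that is to assign streams to subdomains round-robin. A tiny sketch, reusing the placeholder hostnames above and the placeholder port from the earlier MJPEG server example:

# Spread streams across the subdomains round-robin so that no single
# hostname carries more than a handful of connections.
def stream_url(index: int, path: str, host_count: int = 4) -> str:
    host = f"cameras{index % host_count + 1}.local"
    return f"http://{host}:8080{path}"

print(stream_url(0, "/front"))    # http://cameras1.local:8080/front
print(stream_url(4, "/gripper"))  # http://cameras1.local:8080/gripper (wraps around)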

WebRTC

One potential approach is using WebRTC, which is built for low-latency applications like video conferencing. We haven't investigated this approach yet for a few reasons. WebRTC is primarily aimed at browser-based applications exchanging data with each other, not receiving data from a non-browser application, so example code for non-browser environments streaming into browsers is hard to find. WebRTC is also built around the assumption that it needs to traverse NAT using a STUN server -- not only is that not necessary for us, it's impossible as our network runs completely offline. That said, the use of WebRTC for low-latency streaming holds potential. I'd love to see this investigated, and hope I'll get the time to play around with it myself at some point.

Conclusion

We've walked through the steps in building a camera streaming pipeline, from the choice of individual camera modules to the framework used to display them to the user. This guide is the resource I wish our team had when developing our streaming pipeline, and covers many of the pitfalls and decisions that will need to be made. It's far from complete; there are many techniques and topics in streaming to explore that simply fall outside of what we've worked with, but I hope it's useful in whatever system you're building nonetheless.

]]>
https://breq.dev/2023/06/21/cameras /2023/06/21/cameras Wed, 21 Jun 2023 00:00:00 GMT
<![CDATA[LDSP: low-level audio programming with rooted Android phones]]>

A few weeks ago, I achieved one of my dreams: being on a team that publishes a paper from doing something cool. Here it is in full (shoutout NIME for being open-access):

You can see my cameo as a hand model in Figure 5.

Our Vision

The project was pitched to me by my advisor roughly like this:

  • Boards like the Raspberry Pi, Bela, etc. are excellent tools that enable people to learn audio programming and make creative instruments using sensor data.
  • These boards can be expensive, and are especially difficult to find in certain parts of the world like Latin America.
  • A commodity smartphone, even an older one, has plenty of sensors, processing power, and audio capabilities to support this use case.
  • Old smartphones, specifically Android ones, are quite readily available in most parts of the world.
  • We just need to find a way to get a software environment with low-latency audio I/O onto these devices and package it into something easy to use.

My Work

Our approach was to circumvent the JVM entirely by rooting the phones, building binaries for them on a host computer, and executing these binaries directly. This gave us more direct access to the hardware, at the cost of a more difficult setup process. I focused on two main goals: our sensor I/O and our build system.

Sensor I/O

This involved working with libandroid, Android's internal library used for handling device functions. I spent a ton of time in Android Code Search to investigate how everything fits together, and started to put together some test functions for adjusting sensor sampling rates and reading data, which my advisor then turned into LDSP's user-facing sensor API, providing functions like sensorRead -- taking heavy inspiration from the Arduino IDE's simple functions like analogRead.

Build System

Our build system draws heavily on the Android NDK, or Native Development Kit. This toolchain is designed to build smaller native components within a larger JVM-based app, so we needed to circumvent the intended way of doing things a bit.

Initially, we just manually invoked the g++ binary included with the NDK to compile our executables. However, as the project grew and we began pulling in more and more libraries, it became clear another solution was needed. We decided to start with Makefiles, due to their popularity and simplicity. This worked for a while, and even let us add some helper targets to the file (e.g. we set up make push to push the binaries to the phone).

This solution worked well as we continued development, but started to strain when we wanted to make our build system more user-friendly. We wanted to provide a configuration file for each phone we supported, so users would not need to configure compiler settings for their phone manually -- and many phone model names contain spaces. Furthermore, we wanted to let users specify a path to their project and compile some of their files alongside ours, and allowing user project names to contain spaces was a hard requirement for us.

As it turns out, Makefiles handle spaces extremely poorly -- not only do they have to be escaped differently in different places, but many functions treat a variable with spaces implicitly as a list, causing issues. Many make builtins simply could not handle spaces in filenames, so we were forced to use $(shell ...) and rely on shell commands (which aren't portable across operating systems). Our solution was still plagued with bugs.

Eventually, we gave up on pure Makefiles and I started to investigate other build systems. Ninja seemed promising, and has the backing of projects like Chromium and LLVM. We rolled out CMake files across the repo, set CMake's generator to Ninja, and... success!

Of course, CMake isn't necessarily built for running random commands in the way pure Makefiles are, so I created a few wrapper scripts: ldsp.sh for Linux and macOS and ldsp.bat for Windows. Each script allows configuring the project, building (with incremental builds), pushing the build to the phone, and running the build.

We got to see everything come together in an all-day workshop we did with many students from NU Sound. While the install process was a bit convoluted, most participants were able to get a toolchain up and running on their computer!

Future Goals

The biggest hurdle with our current setup is complexity: participants need the Android NDK, CMake, Ninja, ADB (Android Debug Bridge), and more installed on their computer. If we could package everything into an easy-to-install application, we could make this project more accessible to those unfamiliar with the command line or programming in general, and make it much more fitting as an educational tool. Furthermore, if we could install this package onto the phone itself, we could remove the need for a host device altogether, making our work accessible to those without a laptop or desktop.

I've been working on cross-compiling LLVM to run natively on an Android host. Admittedly, I haven't had much luck so far, but it's still early days. Once we have this crucial portion working, we can pivot to developing a web UI which runs on the phone to support editing and running code from any connected device, then package everything up into an .apk for people to install.

Thanks

I want to thank my advisor, Victor Zappi, for giving me the opportunity to work on this amazing project.

]]>
https://breq.dev/2023/06/17/ldsp /2023/06/17/ldsp Sat, 17 Jun 2023 00:00:00 GMT
<![CDATA[How To Reverse an Android App]]>

This blog post follows my journey to learn about reverse-engineering on Android over a few months. Unlike a traditional project writeup, the structure of this piece matches the process of discovery I took. Dead-ends, useless tangents, and inefficient solutions have been intentionally left in.

The Idea

I live in Boston currently, which has a wonderfully extensive transit network. To navigate between unfamiliar areas, I use an app called Transit. Transit (or TransitApp, as I tend to call it) tracks bus, train, and subway departures in realtime to provide directions and time predictions.

A screenshot of the app's interface, showing a map and a list of transit lines.

TransitApp also has a "gamification" system, in which you can set an avatar, report the location of trains from your phone, and show up on leaderboards.

Instead of public-facing usernames or profile pictures, TransitApp's social features work off of random emoji and generated names. You are given the option to generate a new emoji/name combination.

For reasons I won't get into, I really wanted the 🎈 emoji as my profile. It's not on the list of emoji that the shuffling goes through (I shuffled for quite a while to confirm). If you pay for "Transit Royale" (the paid tier), you can choose your emoji yourself. However, I was bored, so I decided to see if I could find a more fun way to get it.

My hypothesis was that the app is doing this "shuffling" logic on the client side, then sending a POST request to the server with the emoji to be chosen. If true, this would mean that I could replay that request, but with an emoji of my choosing.

Methods

Android Emulator

Because messing with a physical Android device seemed tricky, I decided to try using the Android emulator built into Android Studio. (No need to create a project: just click the three dots in the upper right and choose "Virtual Device Manager.") After picking a configuration that supported the Google Play Store (I chose the Pixel 4), I installed TransitApp and logged in without issue.

Burp Suite

Burp Suite is the standard tool for inspecting and replaying HTTP network traffic. Burp Suite creates an HTTP proxy, and is able to inspect traffic via this proxy.

I began by setting Burp to bind on all interfaces, then set the proxy in the emulated phone's settings to point at my computer's IP address and proxy port.

Burp Suite's analysis tools effectively break HTTPS, since what they do is the literal definition of a man-in-the-middle attack. In other words, TLS protects the connection from my phone to any cloud services, and for Burp to mess with that, it needs a way to circumvent TLS. One approach is to install Burp's certificate as a "trusted" CA certificate, effectively making Burp Suite able to impersonate any website it wants.

Most guides for manually installing CA certificates on Android require a rooted operating system, but with a reasonably recent OS, it's possible to install them in the phone's settings, by going to: Settings -> Security & privacy -> More settings -> Encryption & credentials -> Install a certificate -> CA certificate. This adds a "Network may be monitored" warning to the phone's Quick Settings page.

This works perfectly! Now let's just fire up TransitApp, and... nothing. It looks like TransitApp doesn't respect the user's proxy settings.

A few apps claim to be able to use a VPN profile to force all traffic over a proxy, but none of the ones I tested worked properly.

Wireshark

Wireshark is a much more general-purpose network traffic capture tool. It supports capturing and filtering network traffic, making inspection relatively easy.

However, Wireshark cannot circumvent TLS. As a result, even though the presence of traffic is visible, it cannot be actually inspected. When filtering for DNS traffic, though, we do get something helpful: the domain names that TransitApp uses. We see a few:

  • api-cdn.transitapp.com
  • stats.transitapp.com
  • service-alerts.transitapp.com
  • api.revenuecat.com
  • bgtfs.transitapp.com
  • api.transitapp.com

With the exception of RevenueCat, which seems to manage in-app subscriptions, all domains are *.transitapp.com.

Burp Suite Invisible Mode

Burp Suite's Invisible Mode allows the proxy to work for devices that aren't aware of its existence, by relying on the host OS to override DNS queries for the relevant domain names and send them to the proxy instead. To get this to work, we can run the Burp .jar with sudo, then set up two proxies: one on port 80 for HTTP and one on port 443 for HTTPS. Make sure to enable "invisible proxying" for both.

Since we're using invisible proxying, we'll need to explicitly tell Burp where to forward the traffic -- the HTTP(S) requests themselves won't have enough information for Burp to route them onwards. We can make a DNS request to find the IP address of the server hosting api.transitapp.com:

$ dig @1.1.1.1 api.transitapp.com
; <<>> DiG 9.10.6 <<>> @1.1.1.1 api.transitapp.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38753
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.transitapp.com. IN A
;; ANSWER SECTION:
api.transitapp.com. 248 IN A 34.102.188.182
;; Query time: 55 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Fri Jun 16 18:27:14 EDT 2023
;; MSG SIZE rcvd: 63

Configure both proxies to route traffic to that IP address, keeping the ports the same.

A screenshot of Burp Suite's proxy settings showing the stated configuration

If you've played with networking before, your first instinct will probably be to mess with the /etc/hosts file to override DNS for the domains you want to intercept. This is a pretty common technique for web-based attacks, so let's give it a try. One hiccup: we can't put wildcards directly in /etc/hosts, so we'll have to list each one individually. Worse, we'll actually have to only do one at a time, since Burp Suite can only forward traffic to one IP at a time. Start with api.transitapp.com.

Aaaand... it still doesn't work. Android uses a feature called Private DNS to send DNS queries over TLS, meaning they can't be tampered with as easily as traditional DNS. But even if you turn that off in the emulated phone's network settings, it still doesn't work, because the emulator doesn't respect the /etc/hosts file. You'll need to run a DNS server.

First, find the IP address of the emulated phone and your computer. In the phone, go to the network settings page, then look for "IP Address" and "Gateway": those are the IPs of the phone and your computer, respectively.

Install dnsmasq on your host computer and run it (here's a nice guide for macOS). Set the Android DNS settings to point to your computer's IP. And then update your /etc/hosts entry to point to the IP address of your computer on the network instead of 127.0.0.1, since otherwise the emulated phone would attempt to connect to itself.

Even then, this doesn't work. If you try to inspect traffic, relevant features of the app will simply stop working, and you won't see any connection attempts.

Static Analysis

Maybe some static analysis could help? We can download an APK from a definitely legitimate source, then unzip it (.apk files are just .zip files).

The first thing we can do is to start looking for URLs. We already know that some URLs start with api.transitapp.com, so let's try looking for that:

$ rg "api.transitapp.com"

No results. What about just "transitapp.com"? Still nothing.

It could be substituted in somewhere. Maybe we could look for the beginning "http" or "https" part of a URL?

$ rg http
assets/cacert.pem
9:## https://hg.mozilla.org/releases/mozilla-release/raw-file/default/security/nss/lib/ckfw/builtins/certdata.txt
okhttp3/internal/publicsuffix/NOTICE
2:https://publicsuffix.org/list/public_suffix_list.dat
5:https://mozilla.org/MPL/2.0/
META-INF/services/io.grpc.ManagedChannelProvider
1:io.grpc.okhttp.d
google/protobuf/source_context.proto
3:// https://developers.google.com/protocol-buffers/
google/protobuf/empty.proto
[many similar protobuf results omitted]
META-INF/MANIFEST.MF
494:Name: okhttp3/internal/publicsuffix/NOTICE
497:Name: okhttp3/internal/publicsuffix/publicsuffixes.gz
META-INF/CERT.SF
495:Name: okhttp3/internal/publicsuffix/NOTICE
498:Name: okhttp3/internal/publicsuffix/publicsuffixes.gz
res/56.xml
2:<resources xmlns:tools="http://schemas.android.com/tools"
res/sd.xml
6:<resources xmlns:tools="http://schemas.android.com/tools"

Okay, that's interesting. "okhttp3"? This seems like it could be related to how the application makes HTTP requests to the API. But this still leaves us with a few questions:

  • Why weren't we able to find the URLs? From a bit of research, it looks like Android apps store Java code in .dex files: the Dalvik Executable Format. (Dalvik is the name of the virtual machine used to run Android apps.) It is possible that this format uses compression or other techniques which would prevent a literal string from appearing. Running rg with the -a parameter does show some matches in a binary file, but they seem to be in a section of the file which just stores string literals, and based on the limited number of matches, it is likely that URLs are assembled at runtime (meaning we won't find a fully-formed URL in the source).

  • Why did the app refuse to connect via our monitoring setup? Here's where we need to dig in to how OkHttp works a bit more.

Certificate Pinning

OkHttp is a library made by Square for making HTTP requests on Android (or other Java platforms). The fact that it was developed by Square, a payment processing company, is a clue that they might be taking steps to secure the connection that most apps wouldn't.

Here's a snippet from the front page of the OkHttp documentation, emphasis mine:

OkHttp perseveres when the network is troublesome: it will silently recover from common connection problems. If your service has multiple IP addresses, OkHttp will attempt alternate addresses if the first connect fails. This is necessary for IPv4+IPv6 and services hosted in redundant data centers. OkHttp supports modern TLS features (TLS 1.3, ALPN, certificate pinning). It can be configured to fall back for broad connectivity.

Certificate pinning sounds relevant to what we're doing here, considering our approach is to supply an alternate certificate. So what is it? According to the OkHttp docs:

Constrains which certificates are trusted. Pinning certificates defends against attacks on certificate authorities. It also prevents connections through man-in-the-middle certificate authorities either known or unknown to the application’s user. This class currently pins a certificate’s Subject Public Key Info as described on Adam Langley’s Weblog. Pins are either base64 SHA-256 hashes as in HTTP Public Key Pinning (HPKP) or SHA-1 base64 hashes as in Chromium’s static certificates.

Our monitoring setup is such a man-in-the-middle scenario: we use our own certificate authority (in this case, Burp Suite) to "re-secure" the actual connection from TransitApp, after we've monitored and tampered with the connection.

So, how do we get around certificate pinning? We'll need to get quite a bit more serious about our decompilation efforts, since we'll need to remove the hash of the existing certificate in the code and replace that with the hash of our own certificate. I roughly followed this guide for this step.
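
For reference, the pin value itself is easy to compute: it's just the base64-encoded SHA-256 hash of the certificate's Subject Public Key Info, as the OkHttp docs describe. Here's a quick sketch of computing it for a certificate (say, an exported Burp CA certificate) with Python's cryptography library; the file path is a placeholder:

import base64
import hashlib

from cryptography import x509
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Load a CA certificate in PEM format -- "burp_ca.pem" is a placeholder path.
with open("burp_ca.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# Pin = base64(SHA-256(SubjectPublicKeyInfo)), prefixed with "sha256/".
spki = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
pin = "sha256/" + base64.b64encode(hashlib.sha256(spki).digest()).decode()
print(pin)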

APK Analysis

Android Studio helpfully includes an APK Analyzer tool which we can use to peek a bit deeper into the app. Create a new project, then drag and drop the TransitApp APK into the main window.

Unfortunately, method names have pretty much all been minified, so you'll see a bunch of u4, a5, o4, etc. Android Studio also doesn't let us view or modify the Java code within each method. However, note that even the OkHttp3 code seems to be minified, and I couldn't find any reference to CertificatePinner (although rg -a CertificatePinner returned a few matches, so maybe there's hope?)

An open-source tool called APKtool might save us. APKtool allows us to decompile an APK into Smali (essentially an assembly listing for Dalvik bytecode), make modifications, then recompile it. To start, let's see if we can find the certificate hash.

First, download the .jar from here, then run it with:

java -jar apktool_2.7.0.jar d Transit\ Bus\ \&\ Subway\ Times_5.13.5_Apkpure.apk

Finally, dive into the directory named after that APK. We're looking for the code that invokes CertificatePinner.

Reading Smali Bytecode: Working Up

rg CertificatePinner

This seems to give quite a few results within the okhttp3 library: the implementation of certificate pinning, special handling in the connection class, and a few other uses. However, there is one result outside that library:

smali_classes2/com/masabi/justride/sdk/platform/AndroidPlatformModule2.smali
91:.method private getCertificatePinner()Lokhttp3/CertificatePinner;
107: new-instance v1, Lokhttp3/CertificatePinner$a;
111: invoke-direct {v1}, Lokhttp3/CertificatePinner$a;-><init>()V
206: invoke-virtual {v1, v3, v4}, Lokhttp3/CertificatePinner$a;->a(Ljava/lang/String;[Ljava/lang/String;)Lokhttp3/CertificatePinner$a;
215: invoke-virtual {v1}, Lokhttp3/CertificatePinner$a;->b()Lokhttp3/CertificatePinner;
250: invoke-direct {p0}, Lcom/masabi/justride/sdk/platform/AndroidPlatformModule2;->getCertificatePinner()Lokhttp3/CertificatePinner;
258: invoke-virtual {p1, v0}, Lokhttp3/OkHttpClient$a;->d(Lokhttp3/CertificatePinner;)Lokhttp3/OkHttpClient$a;

The result on line 111 looks perhaps the most interesting to us: invoking the constructor on the CertificatePinner class. However, it doesn't seem to be passing in any sort of hash. Let's consult the OkHttp documentation as to how CertificatePinners are constructed:

String hostname = "publicobject.com";
CertificatePinner certificatePinner = new CertificatePinner.Builder()
    .add(hostname, "sha256/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=")
    .build();
OkHttpClient client = new OkHttpClient.Builder()
    .certificatePinner(certificatePinner)
    .build();
Request request = new Request.Builder()
    .url("https://" + hostname)
    .build();
client.newCall(request).execute();

Huh, okay, so we're really looking for a CertificatePinner.Builder. But rg CertificatePinner.Builder gives no results.

Here's where a Java quirk comes into play: Nested classes like CertificatePinner.Builder are represented internally using a dollar sign in place of the dot. So we're really looking for CertificatePinner$Builder. Make sure to escape it properly:

rg 'CertificatePinner\$Builder'

And... still no results. But don't lose hope yet--remember our experiments with APK Analyzer? Perhaps the name Builder got minified. Let's try to find any nested classes within CertificatePinner:

rg 'CertificatePinner\$'

It looks like there's a CertificatePinner$a, CertificatePinner$b, and CertificatePinner$c. The docs seem to show three nested classes: Builder, Companion, and Pin. Out of the three, it seems like Builder is the only one that should need to be used externally. Looking at the one result outside of the okhttp3 package:

smali_classes2/com/masabi/justride/sdk/platform/AndroidPlatformModule2.smali
107: new-instance v1, Lokhttp3/CertificatePinner$a;
111: invoke-direct {v1}, Lokhttp3/CertificatePinner$a;-><init>()V
206: invoke-virtual {v1, v3, v4}, Lokhttp3/CertificatePinner$a;->a(Ljava/lang/String;[Ljava/lang/String;)Lokhttp3/CertificatePinner$a;
215: invoke-virtual {v1}, Lokhttp3/CertificatePinner$a;->b()Lokhttp3/CertificatePinner;

We're looking for the call to Builder.add(), since that will pass in the pin as a hash. Again, the name .add() will be minified, so we'll need to be clever. Now that we've narrowed things down to a single file, we can start to read through and look for anything interesting. This method stands out (.line directives omitted for brevity):

.method private getCertificatePinner()Lokhttp3/CertificatePinner;
.locals 8
iget-object v0, p0, Lcom/masabi/justride/sdk/platform/AndroidPlatformModule2;->sdkConfiguration:Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;
invoke-virtual {v0}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->getCertificatePins()Ljava/util/List;
move-result-object v0
new-instance v1, Lokhttp3/CertificatePinner$a;
invoke-direct {v1}, Lokhttp3/CertificatePinner$a;-><init>()V
invoke-interface {v0}, Ljava/util/List;->iterator()Ljava/util/Iterator;
move-result-object v0
:goto_0
invoke-interface {v0}, Ljava/util/Iterator;->hasNext()Z
move-result v2
if-eqz v2, :cond_0
invoke-interface {v0}, Ljava/util/Iterator;->next()Ljava/lang/Object;
move-result-object v2
check-cast v2, Ljava/lang/String;
iget-object v3, p0, Lcom/masabi/justride/sdk/platform/AndroidPlatformModule2;->sdkConfiguration:Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;
invoke-virtual {v3}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->getHostname()Ljava/lang/String;
move-result-object v3
const/4 v4, 0x1
new-array v4, v4, [Ljava/lang/String;
const/4 v5, 0x0
new-instance v6, Ljava/lang/StringBuilder;
invoke-direct {v6}, Ljava/lang/StringBuilder;-><init>()V
const-string v7, "sha256/"
invoke-virtual {v6, v7}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
invoke-virtual {v6, v2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
invoke-virtual {v6}, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;
move-result-object v2
aput-object v2, v4, v5
invoke-virtual {v1, v3, v4}, Lokhttp3/CertificatePinner$a;->a(Ljava/lang/String;[Ljava/lang/String;)Lokhttp3/CertificatePinner$a;
goto :goto_0
:cond_0
invoke-virtual {v1}, Lokhttp3/CertificatePinner$a;->b()Lokhttp3/CertificatePinner;
move-result-object v0
return-object v0
.end method

Let's step through what this is doing, using our intuition to bridge the gaps:

  1. Calling SdkConfiguration.getCertificatePins(), which returns a list of some type (maybe Strings?)
  2. Creating a CertificatePinner.Builder (shown here as a CertificatePinner$a)
  3. Iterating through the list of certificate pins
  4. For each pin, using a StringBuilder to assemble a hash string (starting with sha256/)
  5. Calling .add() on the CertificatePinner.Builder object with the constructed string for each pin
  6. Returning the result of calling .build() on the Builder

This means we'll need to search a little bit deeper to find the hashes we seek, starting with getCertificatePins().

$ rg getCertificatePins
smali_classes2/com/masabi/justride/sdk/converters/config/SdkConfigurationConverter.smali
454: invoke-virtual {p1}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->getCertificatePins()Ljava/util/List;
smali_classes2/com/masabi/justride/sdk/internal/models/config/SdkConfiguration.smali
762:.method public getCertificatePins()Ljava/util/List;
smali_classes2/com/masabi/justride/sdk/platform/AndroidPlatformModule2.smali
99: invoke-virtual {v0}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->getCertificatePins()Ljava/util/List;

The result with .method public is the definition of the getCertificatePins() method -- let's start there.

.method public getCertificatePins()Ljava/util/List;
.locals 1
.annotation system Ldalvik/annotation/Signature;
value = {
"()",
"Ljava/util/List<",
"Ljava/lang/String;",
">;"
}
.end annotation
iget-object v0, p0, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->certificatePins:Ljava/util/List;
return-object v0
.end method

Okay, so we're just returning the certificatePins field. It's just a classic "getter method." We can track down the field definition:

.field private final certificatePins:Ljava/util/List;
.annotation system Ldalvik/annotation/Signature;
value = {
"Ljava/util/List<",
"Ljava/lang/String;",
">;"
}
.end annotation
.end field

Okay, so it's a private final List<String>. That's pretty standard. This essentially gives us two options: either the pins are set in the constructor, or they're added later through another public method. Let's check the constructor first.

.method private constructor <init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;ZLjava/lang/String;Ljava/lang/String;Ljava/lang/String;Z)V
.locals 2
.annotation system Ldalvik/annotation/Signature;
value = {
"(",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/util/List<",
"Ljava/lang/String;",
">;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/util/List<",
"Ljava/lang/String;",
">;",
"Ljava/util/List<",
"Ljava/lang/String;",
">;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Z",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Ljava/lang/String;",
"Z)V"
}
.end annotation

Oh god, that's 32 lines and we haven't even gotten to an implementation yet. Here's the part of the implementation that deals with the certificate pins:

move-object v1, p4
iput-object v1, v0, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;->certificatePins:Ljava/util/List;

So the pins are passed in as a list via register p4 -- the fourth declared parameter, since p0 holds the implicit this reference. Let's keep following this wild goose chase: where is the constructor called?

$ rg 'SdkConfiguration;-><init>'
smali_classes2/com/masabi/justride/sdk/internal/models/config/SdkConfiguration.smali
211: invoke-direct/range {p0 .. p18}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;-><init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;ZLjava/lang/String;Ljava/lang/String;Ljava/lang/String;Z)V
smali_classes2/com/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder.smali
379: invoke-direct/range {v2 .. v21}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration;-><init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/util/List;Ljava/util/List;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;ZLjava/lang/String;Ljava/lang/String;Ljava/lang/String;ZLcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$1;)V

Oh, duh, it's another builder. At least this one isn't minified? Let's take a look at the SdkConfiguration$Builder.smali file. Alongside brand, country code, and other parameters, we see a few points of interest.

Here's the definition of the certificatePins field on the builder. Note that it isn't marked final, meaning it probably gets assigned to somewhere.

.field private certificatePins:Ljava/util/List;
.annotation system Ldalvik/annotation/Signature;
value = {
"Ljava/util/List<",
"Ljava/lang/String;",
">;"
}
.end annotation
.end field

Next in the file is the .build() method. It does some checks for null, but other than that, seems pretty boring. Scrolling down a bit, though, we see a definition for setCertificatePins():

.method public setCertificatePins(Ljava/util/List;)Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;
.locals 0
.annotation system Ldalvik/annotation/Signature;
value = {
"(",
"Ljava/util/List<",
"Ljava/lang/String;",
">;)",
"Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;"
}
.end annotation
iput-object p1, p0, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;->certificatePins:Ljava/util/List;
return-object p0
.end method

There's no surprise here regarding what this method does (it just assigns the argument to the certificatePins field). However, now that we know the name of this method, we can try to look for it elsewhere.

$ rg setCertificatePins
smali_classes2/com/masabi/justride/sdk/converters/config/SdkConfigurationConverter.smali
199: invoke-virtual {v2, v3}, Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;->setCertificatePins(Ljava/util/List;)Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;
smali_classes2/com/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder.smali
770:.method public setCertificatePins(Ljava/util/List;)Lcom/masabi/justride/sdk/internal/models/config/SdkConfiguration$Builder;

It looks like the only place setCertificatePins is called is within SdkConfigurationConverter. This is an interesting class, and it's not immediately clear what it's doing. A few method names give us a clue, however:

  • convertJSONObjectToModel(JSONObject p1)
  • convertModelToJSONObject(SdkConfiguration p1)

It would be really easy if the certificate pins were just stored in a JSON file somewhere... But find . -type f -name "*.json" comes up empty.

One quick sanity check: Is SdkConfigurationConverter even constructed? It could be that we're looking in the completely wrong part of the code here. Maybe our assumption about certificate pinning isn't correct after all?

Note that SdkConfigurationConverter has a private constructor and a public static create() method. Therefore, we should be searching for SdkConfigurationConverter.create().

$ rg 'SdkConfigurationConverter;->create'
smali_classes2/com/masabi/justride/sdk/jobs/config/ProcessConfigurationDataJob.smali
1096: invoke-static {}, Lcom/masabi/justride/sdk/converters/config/SdkConfigurationConverter;->create()Lcom/masabi/justride/sdk/converters/config/SdkConfigurationConverter;

Okay, so it's used somewhere at least. Is ProcessConfigurationDataJob invoked anywhere? It looks like it has a create() method, just like SdkConfigurationConverter.

$ rg 'ProcessConfigurationDataJob;->create'
smali_classes2/com/masabi/justride/sdk/AndroidJustRideSdkBuilder.smali
121: invoke-static {v0}, Lcom/masabi/justride/sdk/jobs/config/ProcessConfigurationDataJob;->create(Lcom/masabi/justride/sdk/platform/crypto/PlatformSignatureVerifier;)Lcom/masabi/justride/sdk/jobs/config/ProcessConfigurationDataJob;

And just to finish going up the chain, where is this instantiated?

$ rg AndroidJustRideSdkBuilder
[...]
smali_classes2/com/thetransitapp/droid/shared/TransitApp.smali
524: invoke-static {}, Lcom/masabi/justride/sdk/AndroidJustRideSdk;->builder()Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
532: invoke-virtual {p1, p0}, Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;->application(Landroid/app/Application;)Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
540: invoke-virtual {p1, v0}, Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;->configuration(Ljava/io/InputStream;)Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
548: invoke-virtual {p1}, Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;->build()Lcom/masabi/justride/sdk/AndroidJustRideSdk;
[...]

Here we are: TransitApp.smali, which looks like the entrypoint to the application. It passes some sort of InputStream to the builder -- maybe this is the JSON data we're looking for?

Here's the method that invokes this (it's just labeled j, since we're back into minified code):

.method private synthetic j(Ljava/lang/String;Ljava/lang/Throwable;)V
.locals 2
if-eqz p2, :cond_0
return-void
:cond_0
const/4 p2, 0x0
:try_start_0
new-instance v0, Ljava/io/ByteArrayInputStream;
invoke-virtual {p1}, Ljava/lang/String;->getBytes()[B
move-result-object p1
const/4 v1, 0x0
invoke-static {p1, v1}, Landroid/util/Base64;->decode([BI)[B
move-result-object p1
invoke-direct {v0, p1}, Ljava/io/ByteArrayInputStream;-><init>([B)V
:try_end_0
.catch Ljava/lang/Exception; {:try_start_0 .. :try_end_0} :catch_1
.catchall {:try_start_0 .. :try_end_0} :catchall_1
:try_start_1
invoke-static {}, Lcom/masabi/justride/sdk/AndroidJustRideSdk;->builder()Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
move-result-object p1
invoke-virtual {p1, p0}, Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;->application(Landroid/app/Application;)Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
move-result-object p1
invoke-virtual {p1, v0}, Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;->configuration(Ljava/io/InputStream;)Lcom/masabi/justride/sdk/AndroidJustRideSdkBuilder;
[...]

Okay, so v0 is our InputStream. We call getBytes() on the string parameter, then Base64 decode it, then pass that into the ByteArrayInputStream constructor. So where is the string passed into j? Searching just that file (assuming it's the entry point) for ->j gives another method, b:

.method public static synthetic b(Lcom/thetransitapp/droid/shared/TransitApp;Ljava/lang/String;Ljava/lang/Throwable;)V
.locals 0
invoke-direct {p0, p1, p2}, Lcom/thetransitapp/droid/shared/TransitApp;->j(Ljava/lang/String;Ljava/lang/Throwable;)V
return-void
.end method

This looks like a synthetic accessor method -- the kind the compiler generates so that a lambda or method reference can call into a private method.

Now, is this method called anywhere? Doing a regex search for invoke-static \{.*\}, Lcom/thetransitapp/droid/shared/TransitApp doesn't turn up anything related to it.
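One aside before we change tactics: if we ever do get our hands on that configuration string (say, by dumping it at runtime), turning it back into something readable is just a Base64 decode away. A minimal Python sketch, assuming the blob really is Base64-encoded JSON (the file name and key name here are guesses):

import base64
import json

# Hypothetical: the configuration string, captured at runtime and saved to a file.
with open("config_blob.txt") as f:
    blob = f.read().strip()

decoded = base64.b64decode(blob)        # mirrors the Base64.decode() call in the smali
config = json.loads(decoded)            # SdkConfigurationConverter deals in JSON, so this should too
print(config.get("certificatePins"))    # key name is a guess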

Reading Smali Bytecode: Working Down

Maybe we missed something somewhere. Doing a bit of research, it looks like the entrypoint to an Android application is in a class that inherits from Application and overrides the onCreate method. Does our TransitApp class fit the bill? Let's check. Right at the top of the file, we see:

.class public Lcom/thetransitapp/droid/shared/TransitApp;
.super Landroid/app/Application;
.source "TransitApp.java"
# interfaces
.implements Lac/b;

Okay, so that .super line confirms it. Let's look at the onCreate() method. It's quite long, so let's break it down.

.method public onCreate()V
.locals 4
invoke-super {p0}, Landroid/app/Application;->onCreate()V

We start by calling the base Application's implementation of onCreate() -- pretty standard stuff for inheritance. We also have four local variables.

invoke-static {p0}, Landroidx/emoji2/text/c;->a(Landroid/content/Context;)Landroidx/emoji2/text/j;
move-result-object v0
const/4 v1, 0x1
if-eqz v0, :cond_0
invoke-virtual {v0, v1}, Landroidx/emoji2/text/e$c;->c(I)Landroidx/emoji2/text/e$c;
move-result-object v0
invoke-virtual {v0, v1}, Landroidx/emoji2/text/e$c;->d(Z)Landroidx/emoji2/text/e$c;
move-result-object v0
new-instance v2, Lcom/thetransitapp/droid/shared/TransitApp$a;
invoke-direct {v2, p0}, Lcom/thetransitapp/droid/shared/TransitApp$a;-><init>(Lcom/thetransitapp/droid/shared/TransitApp;)V
invoke-virtual {v0, v2}, Landroidx/emoji2/text/e$c;->b(Landroidx/emoji2/text/e$e;)Landroidx/emoji2/text/e$c;
move-result-object v0
invoke-static {v0}, Landroidx/emoji2/text/e;->g(Landroidx/emoji2/text/e$c;)Landroidx/emoji2/text/e;
:cond_0

AndroidX, also known as Jetpack, is a set of Android libraries provided by Google to handle common tasks. androidx.emoji2 is a library to support modern emoji on older platforms, including text rendering and emoji pickers. The method calls here are minified, but we can safely rule this out as uninteresting for now. That said, the if statement that constructs a new TransitApp$a (a nested class) definitely strikes me as odd.

invoke-virtual {p0}, Landroid/content/Context;->getCacheDir()Ljava/io/File;
move-result-object v0
invoke-static {v0}, Lcom/thetransitapp/droid/shared/data/NetworkHandler;->setCacheDir(Ljava/io/File;)V

This seems to be setting the directory to store cached assets.

invoke-static {p0}, Lmb/a;->a(Landroid/content/Context;)Landroid/content/SharedPreferences;
move-result-object v0
iput-object v0, p0, Lcom/thetransitapp/droid/shared/TransitApp;->a:Landroid/content/SharedPreferences;

This code obtains a SharedPreferences instance, likely to retrieve stored user preferences, and keeps a handle to it for later use.

invoke-static {p0}, Lcom/thetransitapp/droid/shared/util/j2;->c(Landroid/content/Context;)I
move-result v0
invoke-virtual {p0}, Landroid/content/Context;->getTheme()Landroid/content/res/Resources$Theme;
move-result-object v2
invoke-virtual {v2, v0, v1}, Landroid/content/res/Resources$Theme;->applyStyle(IZ)V

This handles differentiating between dark and light theme.

invoke-direct {p0}, Lcom/thetransitapp/droid/shared/TransitApp;->d()V
invoke-static {p0}, Lcom/thetransitapp/droid/shared/util/z2;->i(Landroid/content/Context;)V
invoke-static {}, Lcom/thetransitapp/droid/shared/util/z2;->g()Ljava/lang/String;
move-result-object v0
invoke-static {}, Lcom/google/firebase/crashlytics/FirebaseCrashlytics;->getInstance()Lcom/google/firebase/crashlytics/FirebaseCrashlytics;
move-result-object v2
invoke-virtual {v2, v1}, Lcom/google/firebase/crashlytics/FirebaseCrashlytics;->setCrashlyticsCollectionEnabled(Z)V
invoke-virtual {v2, v0}, Lcom/google/firebase/crashlytics/FirebaseCrashlytics;->setUserId(Ljava/lang/String;)V

This code grabs some sort of user ID, stores it in v0, then uses it to set up the Firebase Crashlytics crash reporter.

const-string v1, "com.thetransitapp"
filled-new-array {v1}, [Ljava/lang/String;
move-result-object v1
invoke-static {v1}, Lz2/b;->a([Ljava/lang/String;)V
new-instance v1, La3/k;
invoke-direct {v1}, La3/k;-><init>()V
invoke-virtual {v1}, La3/k;->b()La3/k;
move-result-object v1
invoke-virtual {v1}, La3/k;->d()La3/k;
move-result-object v1
invoke-virtual {v1}, La3/k;->c()La3/k;
move-result-object v1
invoke-static {}, La3/a;->a()La3/d;
move-result-object v2
invoke-virtual {v2, v1}, La3/d;->f0(La3/k;)La3/d;
move-result-object v1
const-string v2, "3687b056476e15e4fe1b346e559a4169"
invoke-virtual {v1, p0, v2, v0}, La3/d;->A(Landroid/content/Context;Ljava/lang/String;Ljava/lang/String;)La3/d;
move-result-object v1
invoke-virtual {v1}, La3/d;->q()La3/d;
move-result-object v1
invoke-virtual {v1, p0}, La3/d;->r(Landroid/app/Application;)La3/d;
move-result-object v1
const-wide/32 v2, 0xea60
invoke-virtual {v1, v2, v3}, La3/d;->c0(J)La3/d;

Checking the a3/a.smali file, we get this header:

.class public La3/a;
.super Ljava/lang/Object;
.source "Amplitude.java"

Amplitude is a product analytics platform, which makes sense as something that would be set up in an onCreate() call.

const/4 v1, 0x0
invoke-static {v1}, Lcom/revenuecat/purchases/Purchases;->setDebugLogsEnabled(Z)V
new-instance v1, Lcom/revenuecat/purchases/PurchasesConfiguration$Builder;
const-string v2, "JfhIYqEpBxRrLkgLLizTDhRqyoPguWdY"
invoke-direct {v1, p0, v2}, Lcom/revenuecat/purchases/PurchasesConfiguration$Builder;-><init>(Landroid/content/Context;Ljava/lang/String;)V
invoke-virtual {v1, v0}, Lcom/revenuecat/purchases/PurchasesConfiguration$Builder;->appUserID(Ljava/lang/String;)Lcom/revenuecat/purchases/PurchasesConfiguration$Builder;
move-result-object v0
invoke-virtual {v0}, Lcom/revenuecat/purchases/PurchasesConfiguration$Builder;->build()Lcom/revenuecat/purchases/PurchasesConfiguration;
move-result-object v0
invoke-static {v0}, Lcom/revenuecat/purchases/Purchases;->configure(Lcom/revenuecat/purchases/PurchasesConfiguration;)Lcom/revenuecat/purchases/Purchases;

This code sets up the RevenueCat integration for in-app subscriptions.

The method continues on, but nothing else in it looks all that notable -- just some error handling stuff. But looking into each of these integrations is making me realize: what exactly were we looking at before?

So, what exactly is JustRide?

Remember, we first found our JustRide method in com/masabi/justride/sdk/platform/AndroidPlatformModule2.smali. Are we sure that this is related to what we're trying to find?

Looking up JustRide, it advertises itself as a mobile ticketing platform. They provide an SDK for integrating ticket purchases into other apps. It seems like TransitApp just includes the JustRide SDK for this functionality, so its use of certificate pinning is largely irrelevant to what we're after.

Okay, back to the drawing board.

Network Security Config

Doing a bit more research, I stumbled upon a GitHub project called AddSecurityExceptionAndroid that claims to patch an APK so that its traffic can be intercepted -- exactly the kind of thing we need here. It links to a few Android reference pages.

First, it mentions Network Security Configuration, which allows application developers to implement certificate pinning using a configuration file instead of any code changes. Maybe this is what we're after? It gives an example AndroidManifest.xml file for this:

<?xml version="1.0" encoding="utf-8"?>
<manifest ... >
<application android:networkSecurityConfig="@xml/network_security_config"
... >
...
</application>
</manifest>

Let's check this against TransitApp's AndroidManifest.xml to see if it uses this. I've grabbed only the <application> tag, since the file is quite large:

<application android:allowBackup="true" android:appComponentFactory="androidx.core.app.CoreComponentFactory" android:extractNativeLibs="false" android:fullBackupContent="@xml/backup_descriptor" android:icon="@mipmap/ic_launcher" android:label="@string/app_name" android:largeHeap="true" android:localeConfig="@xml/locales_config" android:logo="@drawable/action_bar_icon" android:name="com.thetransitapp.droid.shared.TransitApp" android:roundIcon="@mipmap/ic_launcher_round" android:theme="@style/SplashScreen" android:usesCleartextTraffic="true">

I don't see a networkSecurityConfig option, and furthermore, the app specifically opts into "cleartext traffic" (HTTP). Let's keep looking.

The other linked source is a blog post from 2016. Emphasis mine:

In Android Nougat, we’ve changed how Android handles trusted certificate authorities (CAs) to provide safer defaults for secure app traffic. Most apps and users should not be affected by these changes or need to take any action. The changes include:

  • Safe and easy APIs to trust custom CAs.
  • Apps that target API Level 24 and above no longer trust user or admin-added CAs for secure connections, by default.
  • All devices running Android Nougat offer the same standardized set of system CAs—no device-specific customizations.

For more details on these changes and what to do if you’re affected by them, read on.

This is the exact approach we were trying: adding our own CA to try to pull off a man-in-the-middle attack on ourselves.

Reading further in the blog post, it seems like apps need to use a Network Security Config setting to opt into user-added CAs. Since TransitApp does not supply a Network Security Config, this is probably what's blocking us from using our own CA here.

AddSecurityExceptionAndroid

The AddSecurityExceptionAndroid script works by overwriting the network_security_config.xml file and adding the relevant attribute to the application tag in AndroidManifest.xml. Let's run it on our .apk and see if it works.

First, set up a debug keystore in Android Studio, following the documentation. I chose the following settings based on what I found in the addSecurityExceptions.sh file:

  • Key store path: /Users/breq/.android/keystore.jks
  • Key store password: android
  • Key alias: androiddebugkey
  • Key password: android
  • Validity: 25 (default)

After creating the key, you can exit out of Android Studio.

git clone https://github.com/levyitay/AddSecurityExceptionAndroid
cd AddSecurityExceptionAndroid
cp ~/Downloads/Transit\ Bus\ \&\ Subway\ Times_5.13.5_Apkpure.apk ./transit.apk
./addSecurityExceptions.sh -d transit.apk

We use -d to make the .apk debuggable -- I don't know if this will be useful or not, but there's no reason not to.

When running this, I ran into this issue:

W: /tmp/transit/AndroidManifest.xml:82: error: attribute android:localeConfig not found.
W: error: failed processing manifest.

Based on this GitHub issue, it seems like I need to make modifications to the APK before attempting to recompile it. I modified the addSecurityExceptions.sh script to wait before recompiling:

@@ -122,6 +122,9 @@ if [ $makeDebuggable ] && ! grep -q "debuggable" "$tmpDir/AndroidManifest.xml";
mv "$tmpDir/AndroidManifest.xml.new" "$tmpDir/AndroidManifest.xml"
fi
+echo "Make any changes now in $tmpDir"
+echo "Press ENTER when done..."
+read
java -jar "$DIR/apktool.jar" --use-aapt2 empty-framework-dir --force "$tmpDir"
echo "Building temp APK $tempFileName"

Then, in another terminal, I opened the temporary directory (/tmp/transit in my case). I didn't see a locales_config.xml, but I did notice that the AndroidManifest.xml included the parameter android:localeConfig="@xml/locales_config". I removed the localeConfig parameter from AndroidManifest.xml, then continued the addSecurityExceptions.sh script, which worked this time.

Dynamic Analysis: Take 2

We'll need to uninstall the old app from our emulator before installing the APK. Boot up the emulated device and uninstall TransitApp, then drag and drop the new APK onto the emulator.

Doing this, I get an error: INSTALL_FAILED_NO_MATCHING_ABI: Failed to extract native libraries. Actually, when I try to use the unmodified .apk I downloaded, I get the same error. Looking into /tmp/transit/lib, we only see x86_64 -- the sketchy website lied about the APK's architecture.

We can try another sketchy site. Looking in /tmp/transit/lib now, we see arm64-v8a -- perfect. We can verify that this APK installs correctly. (Make sure you have internet access in your emulator -- I forgot to launch dnsmasq and had some issues with this.)

We can go through the same steps of running the script and removing the android:localeConfig parameter. Install this APK, and boom: our modified TransitApp build is running in our emulator. Some parts of the app don't seem to be working: maybe this is because our proxy isn't running?

Start the proxy, verify the settings in /etc/hosts, and restart dnsmasq for good measure.

Holy shit. We're finally getting something.

Here's Where The Fun Begins

We can already see some requests just from opening the app and signing in. Here's the response for /v3/users/me right now:

HTTP/2 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 298
Etag: W/"12a-pknx7teDN3n3f6xhsnm1RmASA/M"
Vary: Accept-Encoding
Date: Fri, 16 Jun 2023 22:29:36 GMT
Via: 1.1 google
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
{"subscriptions":[],"royale":{"user_id":"9702c35a82984e76","avatar":{"username":"Floppy Sensei","username_type":"generated","image_id":"💾","color":"e3131b","foreground_color":"ffffff","visibility":"public"}},"main_agencies":["MBTA"],"time_zone_name":"America/New_York","time_zone_delta":"-0400"}

Here's where we can confirm our hypothesis that the emoji shuffle is done client-side: watching the network log while clicking "shuffle," no requests appear at all. Perfect!

We'll edit our username and icon by accepting one of the suggestions. Here's the generated request (I've changed the User ID and removed the authorization token):

PATCH /v3/users/b1dfb2047e8bd5eb HTTP/2
Host: api.transitapp.com
Accept-Language: en-US
Authorization: Basic [REDACTED]
Transit-Hours-Representation: 12
User-Agent: Transit/20900 transitLib/114 Android/13 Device/sdk_gphone64_arm64 Version/5.14.6
Content-Type: application/json
Content-Length: 196
Connection: Keep-Alive
Accept-Encoding: gzip, deflate
{"avatar":{"color":"f3a4ba","foreground_color":"804660","image_id":"🍦","image_type":"emoji","subscribed":false,"username":"Cone Extravaganza","username_type":"generated","visibility":"public"}}

And here's the response:

HTTP/2 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 180
Etag: W/"b4-0vP9yEjHVjjz1iTkW1gCoYEcQG4"
Vary: Accept-Encoding
Date: Fri, 16 Jun 2023 22:32:01 GMT
Via: 1.1 google
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
{"id":"b1dfb2047e8bd5eb","avatar":{"color":"f3a4ba","foreground_color":"804660","username":"Cone Extravaganza","username_type":"generated","image_id":"🍦","visibility":"public"}}

Let's modify the request a bit. Turn on Burp's Intercept tool, then randomize the name and icon again:

PATCH /v3/users/b1dfb2047e8bd5eb HTTP/2
Host: api.transitapp.com
Accept-Language: en-US
Authorization: Basic [REDACTED]
Transit-Hours-Representation: 12
User-Agent: Transit/20900 transitLib/114 Android/13 Device/sdk_gphone64_arm64 Version/5.14.6
Content-Type: application/json
Content-Length: 191
Connection: Keep-Alive
Accept-Encoding: gzip, deflate
{"avatar":{"color":"ffce00","foreground_color":"855323","image_id":"😐","image_type":"emoji","subscribed":false,"username":"Feral Voovie","username_type":"generated","visibility":"public"}}

Let's quickly swap out the payload for this:

PATCH /v3/users/b1dfb2047e8bd5eb HTTP/2
Host: api.transitapp.com
Accept-Language: en-US
Authorization: Basic [REDACTED]
Transit-Hours-Representation: 12
User-Agent: Transit/20900 transitLib/114 Android/13 Device/sdk_gphone64_arm64 Version/5.14.6
Content-Type: application/json
Content-Length: 191
Connection: Keep-Alive
Accept-Encoding: gzip, deflate
{"avatar":{"color":"ff42a1","foreground_color":"000000","image_id":"🎈","image_type":"emoji","subscribed":false,"username":"Brooke Chalmers","username_type":"generated","visibility":"public"}}

Things didn't update in the UI. However, if we sign out and sign back in... There we go!
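As a side note, once you have the Authorization header, you don't even need Burp -- the same request is easy to replay from a script. A rough Python sketch (the user ID and token are placeholders; other headers from the capture above may or may not be required):

import requests

USER_ID = "b1dfb2047e8bd5eb"            # placeholder -- use your own
AUTH = "Basic REPLACE_ME"               # the Authorization header from your own session

resp = requests.patch(
    f"https://api.transitapp.com/v3/users/{USER_ID}",
    headers={"Authorization": AUTH, "Accept-Language": "en-US"},
    json={
        "avatar": {
            "color": "ff42a1",
            "foreground_color": "000000",
            "image_id": "🎈",
            "image_type": "emoji",
            "subscribed": False,
            "username": "Brooke Chalmers",
            "username_type": "generated",
            "visibility": "public",
        }
    },
)
print(resp.status_code, resp.json())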

Thanks for joining me on this little adventure! I worked on this blog post every now and then over the course of a few months. Getting this to work in the end was such a great feeling -- if my initial hypothesis had been wrong, I still would've learned a lot, but the payoff would've been quite a bit less fun. And if you see a 🎈 emoji rider around Boston, feel free to say hi!

]]>
https://breq.dev/2023/06/16/transitapp-reversing /2023/06/16/transitapp-reversing Fri, 16 Jun 2023 00:00:00 GMT
<![CDATA[Emulation Project - Call for Collaborators!]]>

I'm trying to emulate a bunch of 6502-based systems -- the Commodore PET, VIC-20, and 64, the Apple IIe, and the NES. I'm writing it in Rust, currently targeting desktop and WebAssembly but with plans to support mobile and embedded, too. Right now, I've got the PET working, and you can try it out here.

This is an ambitious project, and I'm seeing some exciting results, but I need your help if I'm going to have a chance at getting all of this working within a reasonable timeframe. If you like Rust, are interested in old hardware, and might have some free time soon, let me know.

A screenshot of a Commodore PET running BASIC.

The Premise

The MOS 6502 defines an era of retro computing. It was a powerful chip, priced incredibly low. This same chip was used in the Commodore 64, Apple IIe, BBC Micro, and others, and a knockoff (with a nearly identical instruction set) was used in the Nintendo Entertainment System. Many of these computers also relied on the MOS 6520 (Peripheral Interface Adapter, or PIA) and the MOS 6522 (Versatile Interface Adapter, or VIA).

Rust, like C or C++, is notable for its wide range of potential platforms. Toolchains exist for desktop, web, mobile, and even embedded processors, leaving open the possibility of creating a handheld device built to run this emulator.

By writing an emulator that works off of a few basic building blocks -- the 6502, the PIA and VIA, some basic RAM and ROM, and some special functions like the VIC chip -- it should be possible to emulate a wide range of systems, sharing much of the code between them. And by implementing this in Rust and targeting a variety of platforms, it should be possible to emulate anything, anywhere.

The Work So Far

If you prefer reading code, take a look at the repo.

--system brooke

The first system implementation was just something I came up with for testing. A System represents the 6502 and some attached Memory, which in this case was RAM from 0x0000 through 0x3FFF, some memory-mapped I/O at 0x4000, and ROM from 0x8000 to 0xFFFF. I chose this configuration to be similar to Ben Eater's homemade computer [1], so that I could use his toolchain and examples to test my processor.
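The real code is Rust, but the address decoding is simple enough to sketch in a few lines of Python (purely illustrative -- treating $4001 as the stdio register matches the assembly below, and everything else here is a simplification):

import sys

# Illustrative sketch of the "brooke" memory map described above (the real
# project implements this as Rust types, not Python).
RAM = bytearray(0x4000)                 # 0x0000-0x3FFF
ROM = bytes(0x8000)                     # 0x8000-0xFFFF; a real system loads this from a file

def read(addr):
    if addr < 0x4000:
        return RAM[addr]
    if addr == 0x4001:                  # MAPPED_STDIO, as used in the assembly below
        ch = sys.stdin.read(1)
        return ord(ch) if ch else 0
    if addr >= 0x8000:
        return ROM[addr - 0x8000]
    return 0                            # unmapped addresses read as zero here

def write(addr, value):
    if addr < 0x4000:
        RAM[addr] = value & 0xFF
    elif addr == 0x4001:
        sys.stdout.write(chr(value))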

Here's some of the assembly I wrote by hand in the early days. This program will capitalize any letter it receives:

MAPPED_STDIO = $4001
  .org $8000
reset:
  LDA MAPPED_STDIO    ; read a character from the memory-mapped I/O register
  CMP #$61            ; below 'a'? then it's not lowercase -- skip
  BMI skip_capitalize
  CMP #$7B            ; at or above '{' (one past 'z')? also not lowercase -- skip
  BPL skip_capitalize
  AND #$DF            ; clear bit 5 to convert lowercase to uppercase
skip_capitalize:
  STA MAPPED_STDIO    ; write the (possibly capitalized) character back out
  JMP reset
  .org $fffa
vectors:
  .word $0000         ; NMI
  .word reset         ; RESET
  .word $0000         ; IRQ

--platform text

How does the above program actually read and write text? This is where Platforms come in. The Platform trait provides platform-specific code to run the System, and each Memory object can keep a PlatformProvider to provide functionality like writing to the terminal, prompting for input, drawing a pixel on the screen, or checking which keys are pressed.

The Text platform is the simplest, only providing read and write capabilities through the terminal.

--system klaus

To verify that every opcode of my 6502 implementation worked, I decided to use Klaus Dormann's functional tests. This System was the harness that let me run these tests.

--system easy

Another project I leaned on was the Easy6502 guide. This guide walks you through writing a game of Snake for a bespoke 6502 system which outputs to a 16x16 color display. I implemented this video system for my emulator and ran the example implementation of Snake. This basic 16x16 display is substantially simpler than any real-world video circuit, so it made a perfect first implementation.

--platform winit

To actually push pixels to the screen, I landed on the pixels crate for plotting and winit for window creation and keyboard events. I initially tried using minifb to handle both tasks, but I found it to have slightly worse performance.

--system pet

My next step was to actually implement a real computer. I chose the Commodore PET, since its simple monochrome text-mode graphics would be relatively easy to implement.

In the PET, text is placed on the screen by writing a specific character code to a specific location in the video memory. There is no color support, bitmap mode, or other frills -- it's just text mode.

The PET also uses a PIA chip, which I needed to implement. This is used to read the keyboard row, and to receive a 60Hz interrupt from the video circuitry.

I did still have to implement the keyboard, which proved slightly difficult. The keyboard layout on the PET's "graphics keyboard" (one of the two standard keyboards for the PET) is quite different from a modern computer keyboard. Notably, it places the double-quote " on a key which does not require Shift to be pressed. After adding that and a few other special cases, it just required implementing the keyboard matrix scan logic to return the correct bits for each keyboard row.
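For anyone unfamiliar with how these keyboards work electrically, the scan logic boils down to something like this toy Python sketch (the layout dict is made up and far smaller than the PET's real matrix, and the real implementation is Rust):

# Toy matrix scan: given the selected row and the set of currently pressed keys,
# return the active-low byte the PIA would read back for that row.
MATRIX = {                      # (row, column) -> key cap; made-up layout
    (0, 0): "A", (0, 1): "B", (0, 2): "C",
    (1, 0): "1", (1, 1): "2", (1, 2): '"',   # '"' gets its own unshifted key
}

def scan_row(row, pressed):
    value = 0xFF                             # all bits high = no keys down
    for (r, c), key in MATRIX.items():
        if r == row and key in pressed:
            value &= ~(1 << c) & 0xFF        # pull that column's bit low
    return value

print(bin(scan_row(1, {'"', "A"})))          # 0b11111011 -- only the '"' in row 1 shows up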

--target wasm32-unknown-unknown

A screenshot of a Commodore PET emulator running in a browser.

This is when I added support for WebAssembly. In a browser, the emulator draws to a <canvas> element, also using winit. (I'm thinking of transitioning away from winit and just directly setting up the <canvas> through JavaScript bindings.) The Easy6502 implementation works fine, and so does the PET. (The text-mode stuff also works, albeit through alert() and prompt() calls.)

--system vic

A screenshot of a VIC-20 displaying the BASIC startup screen.

My most recent work has been trying to emulate the VIC-20. The VIC-20 is named after the VIC chip, or the Video Interface Chip. (Specifically, it's the MOS 6560 or 6561 in NTSC and PAL regions respectively.) This chip manages the background and border colors, the sound output, the light pen, and a few other miscellaneous video-related features.

The VIC-20 also trades the PIAs for VIAs. Although the PET contained a VIA, it was only used for the IEEE-488 interface (used for disk/tape drives and storage), so I didn't implement it. The VIC-20 uses its VIAs for reading the keyboard state and for setting up a 60Hz timer, both of which are required to get a minimal working system.

The VIC-20 uses three separate areas of memory for video-related functions:

  • The screen memory stores what character is displayed at each position on the screen.
  • The character memory stores the shape of each character -- kind of like the "font" of the system.
  • The color memory stores the color code for each position on the screen.

This work is ongoing: at the time of writing, the system boots to the startup screen (with color), but fails to blink the cursor or display typed characters. You can follow along in the bc/vic-20 branch of the repo.
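To make those three memory areas concrete, here's roughly how one character cell gets turned into pixels. This is a simplified Python sketch (hi-res mode only, made-up helper names; the real renderer is Rust and handles more cases):

# screen_mem[cell] holds a character code, char_mem holds 8 bytes per glyph,
# and color_mem[cell] holds the foreground color for that cell.
def draw_cell(cell, screen_mem, char_mem, color_mem, background, set_pixel, columns=22):
    code = screen_mem[cell]
    fg = color_mem[cell] & 0x07              # low bits select the foreground color
    col, row = cell % columns, cell // columns
    for y in range(8):
        bits = char_mem[code * 8 + y]        # one row of the 8x8 glyph
        for x in range(8):
            lit = bits & (0x80 >> x)         # leftmost pixel is the high bit
            set_pixel(col * 8 + x, row * 8 + y, fg if lit else background)

Here set_pixel stands in for whatever the platform layer exposes; a full-screen render is just this loop over every cell.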

The Road Ahead

My immediate goal is to get the VIC-20 working, which should happen soon. Past that, and loosely in order of priority, here's what I want to tackle:

Emulating Disk Drives

Currently, the PET can only be used for running programs that you're willing to type out at the BASIC interpreter. Emulating a disk drive will make it easier to load a wide array of software, increasing the utility of the emulator and helping to test other parts of the system. Notably, lots of Commodore machines used the same drives, which might make this easy.

Cleaning Up the WebAssembly Experience

Right now, the WebAssembly build is a somewhat manual process built on top of wasm-pack, with no automated deployment. Additionally, swapping between systems requires commenting out system-specific code.

I'd like the WebAssembly experience to be user-friendly enough for a user to select the system they would like to run from their browser. I'd also like deployment to be more automated, so that pushes to the Git repository automatically keep the web deployment up to date. This might also be a good time to remove the winit dependency for WebAssembly, and to work with the <canvas> directly. (That would also let us attach event listeners to the page itself, not just the canvas.)

The Commodore 64

I'd like to emulate the Commodore 64, due to its popularity. The biggest difference from the VIC-20 is the video circuitry, so once the VIC-20 is working, this should be a pretty easy system to get running.

Notably, the Commodore 64 uses the MOS 6510, not the 6502. This adds an 8-bit I/O port.

Native Mobile Apps

While mobile users could use the WebAssembly version, the low performance of mobile devices means that the overhead of WASM makes the experience laggy. A native mobile app could also give a better keyboard experience for users.

The Apple IIe

This was another popular 6502-based computer with a rich software library. It has less in common with the Commodore machines, meaning it might be more difficult to get working.

Embedded Design Sketching

My vision is to have some physical device with physical controls and a physical display to run the emulator as firmware. I don't intend on bringing this to market, partly because I don't think there is enough demand and partly because we would have to be careful to avoid copyright issues (e.g. the kernals of the Commodore machines are still protected under copyright). That said, I want it to be inexpensive enough that we could put together a few as a proof-of-concept.

Figuring out the vision for this project requires:

  • Choosing a chip
    • Ideally we'd want one with good Rust support, like the RP2040
  • Choosing a display
    • This might be the most expensive part of the system
    • We'd want something with color, and a good "middle ground" aspect ratio
  • Drawing a schematic
  • Laying out a PCB
  • Assembly!

The Nintendo Entertainment System

The NES has a complicated video system with hardware sprites and multiple modes. It will be quite a challenge to implement. It uses the Ricoh 2A03, which is a 6502 clone but doesn't have Binary Coded Decimal support for patent-related reasons.

Cleaning Up the Desktop Experience

It might be nice to have a user-friendly GUI that allows users to choose their system and ROM before launching. It also might be nice to ship a compiled, signed executable for all platforms.

Additional Systems

Other potential 6502-based systems include:

  • Apple I, other members of the Apple II family
  • Acorn's various Eurocard systems
  • Atari's 8-bit family including the Atari 400 and 800

Support for additional CPUs

In the long term, it might be nice to add support for additional CPUs. Potential candidates include:

WDC 65C02, WDC 65C816, Ricoh 5A22: This family was based on the original 6502. The 65C02 removed some undocumented opcodes, added some new opcodes, and fixed some errata from the old silicon. The 65C816 made even more extensions, including 16-bit registers, but maintains binary compatibility with the 6502. Finally, the Ricoh 5A22 is a clone of the 65C816, similar to how the Ricoh 2A03 clones the 6502.

  • WDC 65C02: Apple IIc, Enhanced Apple IIe, BBC Master, Atari Lynx
  • WDC 65C816: Apple IIGS
  • Ricoh 5A22: Super Nintendo Entertainment System

8080, Z80, "GB-Z80", 8086: This family of processors was also widely used. The Z80 is an extension of the Intel 8080, and the "GB-Z80" (technically a Sharp LR35902) shares many of the same opcodes. The Intel 8086 was designed so that 8080 assembly could be mechanically translated to it, although the binary encodings differ.

  • Intel 8080: Altair 8800, Sol-20
  • Zilog Z80: ZX Spectrum (and the ZX 80 and ZX 81), TRS-80, Cambridge Computer Z88
  • "GB-Z80" / Sharp LR35902: Game Boy, Game Boy Color
  • Intel 8086: IBM PC (model 5150), IBM PS/2, Tandy 1000

Motorola 68000: This is a 16/32-bit processor with a 32-bit instruction set and a 16-bit data bus. It was used in the Macintosh, Amiga, Atari ST, Sun-1, Apple Lisa, Sinclair QL, and Sega Genesis.

Project name?

So far, I've just been calling the project "noentiendo," as a pun on Nintendo and an allusion to the fact that I didn't know much about Rust or retro computing before starting this project. I've been thinking about calling it "MoxEMU," since I really like Moxie soda. I'd love other suggestions -- maybe one will stick!

Timeline

My immediate priority is getting the VIC-20 working. I don't have an ETA on when that'll be finished, but my hope is that it'll be done by New Year's. After that branch gets merged, I'd love to start working on this with a group of people. Hopefully, once the initial design is frozen, collaboration should be easy due to the modular nature of the system.

If you're interested, get in touch with me, and I can keep you up to date!

Footnotes

  1. It's also a big part of what inspired me to start this project!

]]>
https://breq.dev/2022/11/26/noentiendo /2022/11/26/noentiendo Sat, 26 Nov 2022 00:00:00 GMT
<![CDATA[MIDI LiDAR]]>

Overview

This project is a new type of MIDI controller which uses a LiDAR sensor to detect the positions of the user’s hands within two arbitrary zones, then maps this input to four separate MIDI streams. Each stream can send either note values to control pitch or Continuous Controller (CC) messages to control other parameters in a DAW.

This controller uses a single sensor placed in the middle of a playing surface (desk, table, etc). The performer sits or stands such that the sensor is directly in front of their body. On either side of the sensor, zones are marked out in software. These zones can be chosen to align with physical references, such as a printed grid. The performer positions their hands within these zones in the air above the playing surface. Then, the horizontal and vertical position of each hand is used to generate MIDI messages, which control some sound source.

The sensor located on my desk, with two reference sheets corresponding to two zones.

Motivation

My primary inspiration for this project was the theremin. I had the opportunity to play one when I was young, and I was amazed by how intuitive and inviting the instrument felt. I was quickly able to get a sense for how the pitch and volume were controlled, but I still had the sense that there were so many possibilities for the device despite the simple controls. I was also struck by how inviting the device felt. Generally, when I see an unfamiliar instrument, my first reaction is to hold my hand tentatively over it and ask, “Can I play this?” With the theremin, simply being in the same vicinity affects the sound, and when I curiously held my hand near it, I found I was already playing.

Although some highly skilled musicians such as Clara Rockmore developed techniques for the theremin that increased its versatility, the instrument is often used simply for “spooky sound effects” instead of pitched music. The ondes Martenot, a later electronic instrument using a continuous wire, had a non-functional image of a piano keyboard which provided a reference point. This instrument was just as continuous as a theremin, but the reference keyboard made it more widely used in pitched music. It also allowed for multiple configurations of the resonance diffuser, giving the instrument a wider range of timbre.

A more modern inspiration was Quadrant, a MIDI controller developed by Chris Chronopoulos from 2018 through 2021. This controller uses upward-facing time-of-flight sensors to determine the pose of the player’s hand. Notably, it features a variety of different control modes: using four sensors, it can detect the position, velocity, orientation, and angular velocity of the performer’s hand. It also features special modes which can detect sweeps of the hand across the instrument, or plucking motions made above each of the four sensors. Of course, being a MIDI controller, it can be used to control an incredible variety of digital sound sources.

I was also inspired by the laser harp, another modern MIDI controller played without contact. Again, I appreciated how intuitive the device seemed: by leveraging the existing understanding that people have of traditional harps, the laser harp makes its controls obvious.

With this project, I wanted to create a controller that operated in free space, just like the theremin. I wanted to give a reference to allow for more precise pitch control, like the keyboard provided by the ondes Martenot. I wanted to support various control modes, in a manner similar to Quadrant, and I wanted the device to leverage existing intuitions people have about theremins in the same way the laser harp leverages existing intuitions about harps. Above all, I wanted this controller to be as intuitive and inviting as the theremin felt to me when I was younger.

Technical Description

LiDAR Sensor

For this project, I am using a LiDAR sensor, specifically the Slamtec RPLIDAR A1. This sensor is used to detect the position of the user’s hands in free space. It features an infrared laser diode and a receiver, mounted on a spinning platform.

This sensor is a type of Time-of-Flight (ToF) sensor. It functions by sending out a pulse of infrared laser light and measuring the time taken for the light to bounce off of an object and return to the receiver. Stationary ToF sensors, like those used on Quadrant, produce a stream of distance values:

[...]
0.505452 m
0.501343 m
0.499432 m
0.476832 m
[...]

Each of these values represents the distance from the sensor to the nearest object it sees at a specific instant in time. The range of these sensors varies: the sensors used on Quadrant had a range of about 1 meter, while this sensor can measure up to 12 meters. As there are no external factors that drastically affect the speed of light, these readings are generally quite accurate.

By mounting the ToF sensor onto a rotating platform, it is possible to measure at many different angles and construct a 2D map of the sensor's surroundings. For this reason, these sensors are often used in robotics, to allow a mobile robot to map its environment, determine its position, and navigate. While LiDAR sensors are used in some autonomous driving applications, they are perhaps most familiar from high-end robot vacuum cleaners, which use them to map and clean a given area. For this MIDI controller application, however, the sensor is kept stationary.

By combining the distance measurements from the laser diode with the angle of the rotating platform at the time of measurement, the sensor constructs a stream of angle/distance pairs.

[...]
30.12° 0.505452 m
31.10° 0.501343 m
32.07° 0.499432 m
32.98° 0.476832 m
[...]

The sensor used for this project produces a stream of about 1,500 such pairs every second and sends this data to the computer over a serial port.

Sensor Control/Processing Program

The sensor data is then passed to the first of two computer programs I wrote for this project. This first program is responsible for processing the sensor data and determining the position of each hand within each zone. If the sensor is rotating quickly enough, we can consider every measurement taken in one rotation as a single snapshot in time. Then, for each angle/distance pair, we can plot a point on a graph at the given angle and distance from the origin. In other words, we convert the pairs of (angle, distance) coordinates to pairs of (x, y) coordinates and display the result.

Pictured above is the graph that is generated from this process. In this image, the green dots represent points found by the laser, and the purple and orange rings mark distances of 50 and 100 centimeters away from the sensor respectively. This scan was performed in my dorm room. This complete plot of the environment includes the performer's body, the walls of the room, and any other objects in the environment at the sensor's height. Next, it is necessary to filter out anything from the environment which should not affect the output, i.e., everything except the hands of the performer. I decided to accomplish this by processing two separate zones in the plot, such that the performer could move each of their hands within its respective zone in two dimensions. This results in four separate axes of control. The use of two separate zones was chosen to maximize the number of axes available to the user, and to eliminate the possibility of one hand obscuring the other.

The user can select these two zones by clicking each of the eight corners on the graph. In the chart above, my left hand is visible in the zone on the left side and my right hand is visible in the zone on the right side. The sensor, represented by the white dot in the center, is located in the middle of the playing surface, between the zones and directly in front of my body. The software will calculate the position of each point in the scene along each of the four axes. To do this, it constructs a projective transformation matrix, which is a technique from linear algebra. The following procedure was adapted from this StackExchange answer by Dr. Martin von Gagern.

The software knows the coordinates of each corner in space, and it knows the position of each corner relative to each axis (the point where the two axes meet is (0, 0), the opposite corner is (1, 1), etc). Using this, it will construct a transformation matrix that can map any point in space to its position relative to each axis. Intuitively, the most obvious approach would be to construct a 2×2 matrix to perform this mapping, since we are working with pairs of coordinates. However, since we have four separate known mappings, the matrix would have too many conditions to meet. The system would be overdetermined and thus impossible to solve.

We can add "wiggle room" to the system by using a 3×3 matrix instead. However, this requires representing our coordinates using triples instead of pairs. In other words, this requires some way of representing coordinates in the form (x, y, z) instead of (x, y). One such approach uses homogeneous coordinates. Mapping Cartesian coordinates to homogeneous coordinates requires adding a 1 in the third position (such that (x, y) becomes (x, y, 1)), and mapping homogeneous coordinates back into Cartesian coordinates requires dividing the first and second coordinates by the third coordinate (such that (x, y, z) becomes (x/z, y/z)). Notably, this system creates a set of points with homogeneous coordinates of the form (x, y, 0) which cannot be mapped to Cartesian coordinates due to the division by zero. Conceptually, these points represent scenarios like the following:

The above point is around 2.5 on Axis 0, but it has no meaningful coordinate on Axis 1. As the positions of the performer's hands should be inside the quadrilateral zone, scenarios like the above should never arise in this application. For each zone, the software will map the four corner points to their homogeneous representation, then compute the 3×3 transformation matrix. Then, it will sort the points based on whether they lie inside the zone. It determines if each point is inside the zone by checking if its position on either axis is less than zero or greater than one; although more optimized methods of testing whether a point is inside a quadrilateral certainly exist, I found the matrix multiplication approach to be reasonably performant, so I decided against introducing additional complexity. Finally, to create a single value for each axis, the software will average the axis positions of each point located inside the zone. This step produces a stream of four numbers, representing the position of both hands along both axes in their respective zones:

[...]
0.43 0.53 0.21 0.54
0.41 0.57 0.19 0.50
0.42 0.61 0.19 0.52
[...]

This stream of values is then passed to the second program.
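To make the pipeline concrete, here is a condensed numpy sketch of what this first program boils down to: polar-to-Cartesian conversion, the projective transform, and the in-zone averaging. The corner coordinates and function names are made up for illustration, and the real program does plenty more (plotting, the zone-selection UI, streaming):

import numpy as np

def to_xy(scan):
    """Convert (angle_degrees, distance_m) pairs from one revolution into (x, y) points."""
    a = np.radians([s[0] for s in scan])
    d = np.array([s[1] for s in scan])
    return np.column_stack([d * np.cos(a), d * np.sin(a)])

def basis_matrix(corners):
    """3x3 matrix taking the projective basis onto the four given corners."""
    p = np.vstack([np.asarray(corners, dtype=float).T, np.ones(4)])  # homogeneous, 3x4
    coeffs = np.linalg.solve(p[:, :3], p[:, 3])      # express corner 4 in terms of corners 1-3
    return p[:, :3] * coeffs

def zone_transform(corners):
    """Homography mapping the measured zone corners onto the unit square."""
    unit = [(0, 0), (1, 0), (1, 1), (0, 1)]          # order must match `corners`
    return basis_matrix(unit) @ np.linalg.inv(basis_matrix(corners))

def axis_values(points, transform):
    """Average the axis positions of every point that lands inside the zone."""
    h = np.column_stack([points, np.ones(len(points))]) @ transform.T
    uv = h[:, :2] / h[:, 2:3]                        # back from homogeneous to Cartesian
    inside = np.all((uv >= 0) & (uv <= 1), axis=1)
    return uv[inside].mean(axis=0) if inside.any() else None

# Hypothetical zone corners (meters, in sensor space) and a few measurements:
left_zone = zone_transform([(-0.60, 0.10), (-0.20, 0.12), (-0.18, 0.45), (-0.62, 0.43)])
points = to_xy([(150.0, 0.45), (152.1, 0.44), (210.0, 2.31)])
print(axis_values(points, left_zone))                # the third point falls outside and is ignored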

MIDI Mapping Program

The second program handles mapping each axis to a stream of MIDI data. Each of the four axes is handled separately. The program provides a control panel which allows configuring the parameters of the MIDI stream.

Each axis can be mapped to either note messages or Continuous Controller (CC) messages. For note messages, the range of notes is configurable. Additionally, pitch bend messages can be sent to smoothly interpolate between notes. As currently implemented, smoothly transitioning from one note to another requires that the sound source's envelope have no attack or decay; otherwise, the attack will be audible each time a new note is triggered. For CC messages, the control number can be chosen. Both modes support setting the channel number, which is useful for having different axes control different synth voices. A setting to invert the signal is also provided.
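To make the note mapping concrete, here is the arithmetic in isolation -- a Python sketch with made-up parameter names (actually sending the messages, and the CC path, are left out; bend_range assumes the synth's pitch bend range is configured to match):

def axis_to_note_and_bend(value, low=48, high=72, bend_range=2.0):
    """Map an axis value in [0, 1] to a MIDI note number plus a pitch bend value."""
    exact = low + value * (high - low)       # fractional "note number" along the axis
    note = int(round(exact))                 # nearest semitone gets the note-on
    offset = exact - note                    # remainder is expressed as pitch bend
    bend = int(8192 + (offset / bend_range) * 8192)   # 14-bit value centered at 8192
    return note, max(0, min(16383, bend))

print(axis_to_note_and_bend(0.37))           # -> (57, 7700) with these defaults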

Surface Configurations

Since the scanning occurs about 2 inches above the tabletop, the user does not need to touch any surface to use the controller. This allows any variety of surfaces to be placed underneath the setup. The simplest configuration has no reference material whatsoever. However, I found that this made it difficult to use, as I would frequently move my hand into or out of the zone without knowing, unexpectedly playing or stopping a tone.

I also tried placing a piece of paper in each zone and calibrating the corners of the zone to the corners of the page. This solved the aforementioned issue. This configuration worked well for controlling a filter or some other aspect of the sound timbre, but I found that an 8.5x11" zone was too small to precisely control pitch. I also found that, without any sort of pitch reference, it was almost impossible for me to precisely reach a specific note. At this time, I tried decreasing the range of the controller from 2 octaves to 1 octave in an attempt to improve precision, without much success. I also tried disabling the pitch bend messages to see if snapping the hand position to the nearest note would help, but I did not notice any improvement in usability.

Next, I tried an asymmetrical layout, with a portrait-oriented paper on the left and two landscape-oriented pages arranged lengthwise on the right. The left page showed a 4x4 grid, and the right pages had vertical lines denoting the boundaries between notes. I found that this made me much more able to hit a specific note within the octave range, even if I re-enabled pitch bends. That said, I did not play along to any other instruments, so I might have overestimated my ability to precisely reach a particular pitch.

I found it difficult not to hit intermediate notes. While moving my hand, even quickly, if a scan saw my hand in the middle of the move, it would play the intermediate note for about 150ms, which sounded jarring at times. To compensate, I played by lifting my hand up away from the zone before moving it side-to-side between notes. This created an additional issue: due to the shape of my hand, moving it up and down could alter the detected position along the forward-and-back axis. I tried playing with a credit card instead of my hand, but found it just as easy to simply keep my hand from tilting forward or back while moving it up and down.

Mapping Configurations

The first mapping I tried used the controller to control a wavetable synthesizer in Ableton Live. It used the right hand horizontal axis to control pitch and the right hand vertical axis to control the position on the wavetable. The left hand controlled a low-pass filter, with the horizontal axis controlling the frequency cutoff and the vertical axis controlling the resonance parameter. I found wavetable synthesis to be a natural fit for this controller, since having a single parameter as the primary control of the sound’s timbre could give one hand control over both pitch and timbre, freeing up the other hand for controlling an effect or a second sound source.

My attempt at controlling two synthesizer voices with this project used one hand for each voice, making use of my software's ability to send MIDI note messages on different channels corresponding to the input axis. I again used the horizontal axis for pitch. One patch was a sawtooth wave bass, with the vertical axis controlling a low-pass filter, and the other patch was a square wave in a higher octave, with the vertical axis controlling a delay effect. I found controlling two separate voices to be overwhelming. This was mostly because I needed to visually look at each hand to position it properly, which was difficult as they were on opposite sides of the playing area. I also found the delay effect control to have little utility outside of "spooky sound effects." I did enjoy "tuning" the two voices to an interval by ear (since I was not playing with a note reference at the time), and holding an almost-exact interval with my hands while slightly tuning and detuning it produced some interesting sounds, but this was made difficult by the latency issues (discussed in a later section).

Returning to the wavetable and low-pass filter mapping, I tried an alternate playing style by placing a small object in the filter zone to hold those axes at a specific point. This allowed me to focus more on playing melodies with my right hand and less on holding my left in a specific stable position. However, it also removed the “free space” aspect of the controller for that zone, and it took away from the direct coupling of hand-to-sound, reintroducing that hesitance before manipulating a control. Ultimately, this playing style is a tradeoff, diverging from my original vision in pursuit of usability, and I believe having it as an option is a net benefit.

Finally, I tried using the controller in conjunction with a traditional MIDI keyboard. I deactivated the right side zone, placed the keyboard in its place, and kept my left hand in the left side zone. I found that this combination worked surprisingly well: the traditional keyboard allowed me to play notes accurately and with minimal latency, while my left hand was free to alter the timbre of the sound. I found the LiDAR controller felt much more expressive than a simple mod wheel; even though it had similar precision, having a large physical representation of a parameter gave me a better sense of the possibilities it provided, and coupling my hand directly to its output made experimentation feel more direct and natural. While this configuration also does not follow my initial vision for a single, configurable, all-purpose MIDI controller, it succeeded in allowing for intuitive and inviting free-space control of timbre while falling back on a familiar interface for pitch.

Across each of these configurations, I found that mapping two related parameters to a single hand fundamentally changed how I conceptualized the controls. I tried a few mappings of this sort, such as controlling the frequency and resonance of a filter, the time and feedback of a delay module, the amount and wet/dry mix of an overdrive effect, and the rate and feedback of a phaser. In each of these cases, I found I had a much better sense of the timbre possibilities offered by these controls. With a traditional controller, with single knobs mapped to single controls, I would typically move just one knob at a time and miss vast swaths of potential sounds. Having two parameters mapped to 2D space encourages moving diagonally or in a circle to modify both parameters at once. Even though traditional CC controllers allow mapping the same knob to several controls, it is not the typical use case.

One of my favorite moments was finding a particular spot in the 2D zone that caused the filter to resonate particularly strongly or the phaser to fit the melody I was playing perfectly. Exploring these pockets gave me the sense that I was exploring something inherently two-dimensional, instead of just manipulating two parameters at once.

Results

Scan Latency

The issue of scan latency proved difficult to overcome. The LiDAR sensor that I use spins at about 380 RPM when connected to 5V power -- roughly 6.3 revolutions per second -- leading to an effective latency of around 150 milliseconds between scans. This had a few adverse effects on the usability of the controller.

The first issue with the scan latency was the creation of a 150ms “rhythm” in the audio output. The output would jump sharply to a new pitch whenever scan data was received, so if the user were to attempt a gradual slide between notes, the result would sound much choppier than intended. A potential remedy is to interpolate smoothly between scans, ramping to each value just as the measurement following it is received, but doing this would double the effective latency of the controller. This behavior could also perhaps be used for effect if a piece were specifically written around it, but the sensor used in this project does not provide an easy way to precisely control the speed of the scan motor, so synchronizing it to a track would be difficult. Given the scope of the project, I decided against trying to implement precise motor speed control, although this could be a direction for further experimentation.

The latency also proved to be a barrier for precise control. I had assumed that 150ms, while by no means ideal, would be passable for syncing with the beat of slower music, but I overlooked the impact of latency on other aspects of performance. With a continuous instrument (such as a theremin, or playing a guitar with a slide), the performer will adjust in real-time using their ear until they are playing the correct pitch. With the LiDAR controller, the presence of latency disrupts this real-time adjustment, making it extremely difficult to play in tune. Removing the pitch bend messages and snapping each position to the nearest semitone helped, but it was still difficult to precisely position one’s hand in a specific 1/12 of the zone without any markings for guidance. Ultimately, a visual reference was required to accurately play pitches, and even then, the controller was difficult to use.

Future Directions

One direction I did not explore was integrating graphic scores into the project. While I tried a configuration with a general note reference on the surface, it would also be possible to use a reference designed for a specific piece of music. Such a reference would be, in a sense, musical notation for the piece. A potential difficulty with this is the lack of a “time” axis to draw on, which could make duration or progression difficult to express in this notation.

Another potential direction is to use interpolation to reduce the jarring steps between scans. I intended to implement this, but the way in which I structured my software did not lend itself well to this approach, and a major restructuring would be needed. I am curious as to how adding glide between scans would affect the perceived latency of the controller, and how much it would improve the discontinuity between scans. Notably, different amounts of glide could be explored. For example, interpolating for the full 150ms would yield the smoothest output but the greatest latency, but perhaps a glide of 20ms could mitigate most of the jarring clicks/jumps in the audio with a minimal impact on latency. Additionally, different interpolation functions could be explored.
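
To make the glide idea concrete, here is a small sketch (not code from the project) of linearly interpolating from one scan's value to the next over a configurable glide time:

import time
SCAN_PERIOD = 0.15  # roughly 150 ms between LiDAR scans
def glide(previous: float, current: float, glide_time: float, rate_hz: int = 200):
    """Yield values ramping linearly from the previous scan's reading to the
    current one over glide_time seconds, then hold until the next scan."""
    steps = max(1, int(glide_time * rate_hz))
    for i in range(1, steps + 1):
        yield previous + (current - previous) * (i / steps)
        time.sleep(1 / rate_hz)
    # Hold the final value for the remainder of the scan period.
    # A short glide (say 0.02 s) smooths most of the jump while adding far
    # less latency than interpolating across the full 150 ms period.
    time.sleep(max(0.0, SCAN_PERIOD - glide_time))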

More direct control over the sensor itself could be explored. The specific model of LiDAR sensor I used lacks an interface to control the motor speed or scan rate. A sensor with more features could allow tweaking these parameters, potentially leading to reduced latency and/or increased precision, especially since LiDAR sensors are typically calibrated to scan an entire room instead of a tabletop surface. The scan rate could even be tuned to match the tempo of the music being played, resulting in a much more predictable playing experience and eliminating the issue of sudden off-beat jumps in output.

Finally, gathering the perspectives of others would be an important next step. Due to the timing of this project ending over spring break, I did not have much opportunity to share this project with others. Of course, when I tested the controller, I brought along my past intuitions and experiences. Other users would be drawing on different prior intuition when using the controller, and understanding its usability from more perspectives could inform how it could be made more intuitive and more inviting.

]]>
https://breq.dev/projects/midilidar /projects/midilidar Fri, 13 May 2022 00:00:00 GMT
<![CDATA[MOTD Necklace]]>

The necklace, in two separate frames. On the left, it shows my radio license in a heart-shaped frame, on the right, it shows a message my friend came up with in a square frame.

Overview

This project is a necklace/amulet which can display a message on an e-Ink display. The persistence of the e-Ink makes it so that the display, once updated, does not require a battery to hold the image.

Motivation

I was inspired by a display pendant idea I found on Adafruit which used an ATTiny85 and an LED matrix display to show an animation. I wanted to experiment with a wearable project, but I didn't want to have to remember to charge it or design it around a large battery.

I was also looking for an excuse to use an e-Ink display for a project, due to their unique design considerations and low power requirements. I decided an e-Ink screen could be a good fit for this project.

Technical Description

The project consists of a monochrome e-Ink display mounted in a 3D printed frame. Pogo pins on a programming board I made are used to connect to the pins on the display breakout board.

The programming board, featuring an M0 board and a row of pogo pins.

I've printed two frames that I can switch between: a smaller, square one and a larger, heart-shaped one.

Results

Even though I didn't put any special attention into weatherproofing, I've worn the necklace outdoors / in the rain / on the beach plenty and I haven't had any issues. This is probably because there is no onboard power: even though there is no special ingress protection, the display is fine as long as it is dry when programmed. Also, the 3D printed frame seems to fit the display tightly enough that dust/dirt/sand exposure doesn't pose much of an issue.

The programming process ended up working pretty well, and it's definitely something I could change every day if I had the motivation to. The pogo pins do require force to push the display against them, and slipping up at the wrong moment can cause the display to be refreshed improperly, causing issues. Some sort of latch to hold the display in place during programming could have remedied this.

Overall, I'm happy with how this project turned out, and the necklace (with clever message) has become a common feature of my daily outfit.

]]>
https://breq.dev/projects/motd /projects/motd Fri, 13 May 2022 00:00:00 GMT
<![CDATA[Artificial Soundscapes]]>

A screenshot of the sonification project in Ableton Live, with data mapped to automation lanes and MIDI notes.

The final sonification, featuring sounds generated from Boston, Los Angeles, and Anchorage data.

Overview

In this project, I attempted to use data sonification techniques to create abstract soundscapes for various cities. Notably, the process of producing sounds from data is as similar as possible between the various cities, to allow the listener to compare and contrast these cities based on each soundscape. I wanted the listener to be able to understand the different aesthetic of each soundscape as a whole, but I also wanted to enable the listener to recognize how specific measurements and data points differed both over time and between cities.

One challenge I had was balancing the aesthetics of the compositions with the need to convey information. Sonification typically uses more abstract sound sources, but these can sound jarring and disjointed. Alternatively, soundscapes often blend layers of sound together, but using this approach could impede the actual presentation of data. As I wanted to create a piece which could serve both purposes, I needed to find a compromise between these opposing goals.

To solve this, I decided to focus solely on facilitating comparing and contrasting of data. Thus, I could tweak the processing of data to achieve a more pleasant sound, and as long as I applied these tweaks equally to each city, it would not impact the listener’s ability to compare between cities. An example of this sort of tweak is mapping data to notes of the major scale instead of arbitrary pitches: this removes distracting dissonance and yields a more musical result, but it does further decouple the audio result from the data it is based on.

Motivation

I drew some inspiration from soundscape compositions, which framed naturally occurring sounds in a more narrative way. Instead of sounds recorded from nature with a microphone, however, I worked with sounds generated from natural data. I focused on natural data partly as an homage to soundscape compositions. I only used data from the span of one year, but a future project could examine multiple decades to show a more dynamic view of the climate.

One of the primary criticisms of sonification as a field is bias in the data processing: when creating an algorithm to convert data into sound, people are likely to have some preconceived notion of what the result should sound like. (For instance, sonification of space-themed data often uses whooshing noises, not because they aid in interpreting the data, but because science fiction has taught us that is what space should sound like.) The mapping from data to sound should leverage existing intuition if possible to help listeners understand the audio, but it should not detract from the data itself. These decisions have to be arbitrary, making this a difficult problem to solve. I drew heavily on Tantacrul's critique of sonification during this project to understand this balance.

One of the most important facets of this project is that the same processing is applied to each city. My hope is that this reduces the likelihood that the processing step I have developed is biased towards telling a specific story or focusing on a particular theme. Without this restriction in place, I might allow my own understanding of cities to influence the results of the sonification instead of letting it be driven by the data.

Technical Description

I sourced weather data for this project from NOAA’s free Climate Data Online service. This service gives daily temperature, wind, and precipitation data for a large number of stations both in the US and globally. Most US cities had multiple stations to choose from, and I specifically chose stations from the largest nearby airport (as these stations often report more data). I chose to analyze data over the course of one year, since it was a short enough interval for individual days to be represented as notes while being long enough to demonstrate the periodic nature of the seasons. I looped this data to emphasize this repetition. I also sourced tidal data from NOAA’s Tides and Currents service.

While I wanted to try to incorporate additional sources (such as public transit data, air traffic data, or highway traffic data) into the soundscape, I could not find a practical way to accomplish this. Although many cities have transit APIs, these lack consistency, and they generally only provide data about the current location of trains and the predicted future schedule (whereas this project is more focused on historical data). Most air traffic APIs are similarly focused on present and future data only, and their rate limits would not allow me to retrieve a year’s worth of data at once for this project. Finally, highway traffic is generally not precisely measured on a real-time basis.

To process this data, I first used a Jupyter notebook and Python code. This program would read the data from each source, then map each data point to one or more MIDI messages. These messages were organized first by city, then by data source, then by “tick” / instant in time. Messages could be grouped together to occur at the same instant, and a list of “cleanup” messages was kept ensuring no notes were left playing if the audio was stopped early. I arbitrarily chose that a tick would represent 1/20 of a second, as it was fast enough for the data to sound continuous but slow enough for the ear to pick out individual data points.

from dataclasses import dataclass
import mido
CITIES = ["LAX", "ANC", "BOS"]
@dataclass
class Tick:
    messages: list[mido.Message]
    cleanup: list[mido.Message]
# One list of Ticks per (city, data source) pair
TRACKS: dict[str, dict[str, list[Tick]]] = {
    city: {
        "weather": [],
        "tides": [],
    } for city in CITIES
}

Additional Python code could then play back these tracks simultaneously over different MIDI channels. An Ableton Live set was configured to receive MIDI input, send separate channels to separate tracks with separate instruments, and map MIDI CC messages to instrument parameters to allow them to be controlled by the incoming data. Finally, the resulting audio was recorded using Ableton Live. I chose this workflow because I had familiarity with these tools, having used Ableton Live for production and Jupyter notebooks for analysis (albeit not simultaneously). Although I experimented with using Max/MSP for a potentially cleaner and more extensible implementation, the timeline for the project made a familiar workflow more pragmatic.
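
For reference, a minimal sketch of what that playback step could look like with mido is below; the port name, channel assignments, and the handling of the cleanup messages are assumptions rather than the project's exact code:

import time
import mido
TICK_SECONDS = 1 / 20  # each tick represents 1/20 of a second
# Hypothetical channel assignment: one MIDI channel per (city, source) track
CHANNELS = {("BOS", "weather"): 0, ("BOS", "tides"): 1}
def play(tracks, port_name="IAC Driver Bus 1"):
    # The port name is a placeholder; mido.get_output_names() lists real ports
    with mido.open_output(port_name) as port:
        length = max(len(t) for sources in tracks.values() for t in sources.values())
        try:
            for i in range(length):
                for (city, source), channel in CHANNELS.items():
                    track = tracks[city][source]
                    if i < len(track):
                        for msg in track[i].messages:
                            port.send(msg.copy(channel=channel))
                time.sleep(TICK_SECONDS)
        finally:
            # If playback stops early, flush the cleanup messages so no
            # notes are left hanging
            for (city, source), channel in CHANNELS.items():
                for tick in tracks[city][source]:
                    for msg in tick.cleanup:
                        port.send(msg.copy(channel=channel))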

I decided to select cities with different climates in order to highlight their differences. I started with Boston, as I have familiarity with its climate. I considered New York, but I decided against it due to its proximity to Boston and similar weather. Next, I chose Los Angeles due to its famously stable climate, followed by Anchorage for its unique geographic location.

While creating the sonification, I noticed that it was difficult to effectively use pitch to represent multiple measurements. Generally, pitch is an appealing quantity to map to, since it has such a wide range and is easy for the ear to recognize. However, mapping multiple quantities to different pitches simultaneously raised some problems. When multiple pitches are being played simultaneously, the listener can end up focusing on the interval instead of each separate note. This detracts from the interpretation of data since whether the interval is major or minor typically does not represent anything. This issue can be remedied somewhat by using distinct sound sources or by playing in separate registers, but I still decided to only map one measurement to pitch.

Another tradeoff I made related to the continuous nature of the phenomena being sonified and the discrete nature of the measurements taken. I found that directly mapping measurements to synthesizer controls could produce a result which sounded discontinuous and disjointed, and I considered implementing some form of interpolation to make the audio result sound smoother. Ultimately, I decided against this, as it could potentially lead to a misleading presentation of the data. This decision, however, did make the audio result less pleasant to listen to, and it compromised on the soundscape aesthetic which I was trying to achieve.

After making these tradeoffs, I eventually decided that trying to create a meaningful and useful sonification which was also an artistic piece following the aesthetic of a soundscape would not be a feasible endeavor. The aim of sonification is to present data in a scientific sense for the ear to recognize new patterns, not to create a piece of music. Massaging the data and representation to fit a specific predetermined aesthetic is fundamentally at odds with this goal. To be true to the sonification of data, I decided to focus less on the soundscape aspect of the project.

This also highlighted a flaw in my earlier assumption that the comparative nature of this project could help eliminate bias. Although the sonification would not be biased based on one specific city’s impact on popular culture, it could still be biased based on my preconceptions about what aesthetic cities should have, the cultural significance of the weather, and other factors unrelated to the data itself. Additionally, through chasing a particular aesthetic, the resulting sonification could lose information, making it less effective at triggering new insights.

I decided to use two sound sources: a wavetable synthesizer to represent the weather data, and white noise to represent the tidal data. The pitch of the wavetable was controlled by the maximum temperature on a given day. I used the Ableton “Basic Shapes” preset and mapped the wavetable parameter to the wind speed such that calm days were represented by sine or triangle waves and windy days were represented by sawtooth waves. Finally, I used a low-pass filter on the white noise controlled by the water level to represent the tides.

I used linear equations to map specific data values to notes or parameter values, attempting to show as much of the usable range as possible. However, while tuning these equations, I focused most on the Boston data. When I tried these equations with the Anchorage data, I found that the temperature dipped below the valid range of note values and the tide data was typically beyond the possible extremes of the control. The tide data was still understandable, but the temperature graph was inaudible during winter.

With a visual graph, I would simply make the graph taller to include all of the relevant information. With sonification, however, I have to work within a limited range of notes. I considered adjusting the mapping to raise all notes up, but that could have caused issues with representing the Los Angeles summer (or any warmer cities, for that matter). I also considered making the mapping denser to fit both low and high temperatures, but that would have made the Los Angeles data harder to understand (as there was little variation from winter to summer anyway, and I did not want to flatten it any further). I could also have shifted just the Anchorage data up. While that could have helped the listener understand more about the climate of Anchorage, it would have interfered with their ability to compare the data with the other cities. Eventually, I decided to leave the equations in place, allowing some of the Anchorage data to be missing from the audio result.
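
To make the tradeoff concrete, the kind of linear mapping and range clipping involved might look like this sketch; the constants are hypothetical, not the ones used in the project:

# Hypothetical mapping tuned around Boston's temperature range
LOW_TEMP, HIGH_TEMP = 10.0, 95.0   # daily highs (degrees F) spanning the mapping
LOW_NOTE, HIGH_NOTE = 36, 84       # MIDI notes C2 through C6
def temperature_to_note(temp_f: float) -> int | None:
    """Linearly map a daily high temperature to a MIDI note number.
    Values outside the usable range return None, which is what left much
    of the Anchorage winter data silent."""
    fraction = (temp_f - LOW_TEMP) / (HIGH_TEMP - LOW_TEMP)
    note = round(LOW_NOTE + fraction * (HIGH_NOTE - LOW_NOTE))
    if LOW_NOTE <= note <= HIGH_NOTE:
        return note
    return None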

Results

Overall, the sonification I produced has some utility, although it did not accomplish the goals I started with. While developing the data processing pipeline, I found that I needed to let go of my expectations for the aesthetic of the result if I wanted the data to be presented in a meaningful way. I also found it more difficult than expected to develop a mapping from data to sound which leveraged the existing intuition of the listener without being biased towards a particular narrative and overshadowing the data. Listening to the end result, I can certainly hear and recognize specific differences between the cities: the stable climate of Los Angeles, the harsh winds of Boston, and the extreme tides of Anchorage.

]]>
https://breq.dev/projects/soundscapes /projects/soundscapes Fri, 13 May 2022 00:00:00 GMT
<![CDATA[Wordle Clones]]>

Overview

Wordle is a popular word guessing game, kind of like Mastermind but for letters. I've written a few clones of the game in different languages.

Motivation

The TypeScript clone was born out of a desire to understand the letter coloring procedure. I had played Wordle before, and I wanted to see if I could implement it myself.

The Rust clone was written because I wanted to learn more about Rust. I figured Wordle was a complex enough game that implementing it cleanly would require a decent understanding of Rust features and best practices.

Technical Description

TypeScript

Code is available at breqdev/wordle.

I built this project in React, but I wanted to ensure the game logic was sufficiently decoupled from the rendered result. I wrote this logic in two pure TypeScript functions, rowColoring and keyboardColoring.

The row coloring function assigns colors to each letter in the word. I took advantage of the type system to define letters in the target as explicitly nullable, allowing them to be "removed" when matched by a letter in the guess.

export function rowColoring(guess: string, target: string) {
  // Store the color alongside each guess letter
  let guessLetters: LetterGuess[] = guess.split("").map((letter) => ({
    letter,
    color: "gray",
  }));
  // Store the target in an array of nullables
  let targetLetters: (string | null)[] = [...target];
  // First pass: match green letters
  guessLetters = guessLetters.map(({ letter, color }, index) => {
    // green letters are matched by the specific index in the target
    if (letter === targetLetters[index]) {
      // remove matching green letters from the pool
      // so that they aren't also matched as yellows
      targetLetters[index] = null;
      return { letter, color: "green" };
    } else {
      return { letter, color };
    }
  });
  // Second pass: greedily match yellow letters
  guessLetters = guessLetters.map(({ letter, color }) => {
    if (color === "green") {
      // don't modify existing green letters
      return { letter, color };
    }
    // yellow letters are matched by searching the entire target word
    else if (color === "gray" && targetLetters.includes(letter)) {
      // remove yellow letters once matched,
      // each letter only matches once
      targetLetters[targetLetters.indexOf(letter)] = null;
      return { letter, color: "yellow" };
    } else {
      return { letter, color };
    }
  });
  return guessLetters;
}

In Wordle, the keyboard serves an important role: it shows how much information you have gotten about a letter based on your guesses. Dark gray signifies that the letter does not appear (it was not colored in a previous attempt), yellow signifies that it does appear (it was colored yellow in a previous attempt), and green signifies that you have correctly guessed the position at least once (it was colored green in a previous attempt). As the coloring of each letter relies on the coloring of previous attempts, the keyboard coloring function makes use of the row coloring function to color each of the guesses.

export function keyboardColoring(guesses: string[], target: string) {
  const letters: Record<string, LetterGuess> = {};
  for (const letter of "abcdefghijklmnopqrstuvwxyz") {
    letters[letter] = { letter, color: "gray" };
  }
  for (const guess of guesses) {
    const coloring = rowColoring(guess, target);
    for (const { letter, color } of coloring) {
      if (letters[letter].color === "gray" && color === "gray") {
        letters[letter].color = "black";
      }
      if (letters[letter].color === "gray" || color === "green") {
        letters[letter].color = color;
      }
    }
  }
  return letters;
}

Rust

Code is available at breqdev/rust_wordle.

I wanted to rely on as many zero-cost abstractions as possible. For storing each row and each word, instead of Vecs allocated on the heap, I decided to use fixed-length arrays with type aliases:

type Word = [char; 5];
#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
struct Square {
    color: Color,
    letter: char,
}
type Row = [Square; 5];

I used a Trait to implement printing the row:

trait PrintWordle {
    fn print_wordle(&self);
}
impl PrintWordle for Row {
    fn print_wordle(&self) {
        // ...
        for square in self.iter() {
            let mut boxed = "│ ".to_owned();
            boxed.push_str(&square.letter.to_string());
            boxed.push_str(" │");
            print_colored(&square.color, &boxed);
            print!(" ");
        }
        println!("");
        // ...
    }
}
// ...
row.print_wordle();

I tried to make use of a functional style for the scoring algorithm, relying on iterators for most of the heavy lifting:

fn score_guess(target: &Word, guess: &Word) -> Row {
    // Map each letter of the target to an Option, so we can "remove" it later
    let mut remaining = target.map(|c| Some(c));
    // All tiles start off white
    let mut result = guess.map(|letter| Square {
        color: Color::White,
        letter,
    });
    // Use `.enumerate()` to check for the right tile in the right index
    for (i, square) in result.iter_mut().enumerate() {
        if target[i] == guess[i] {
            square.color = Color::Green;
            remaining[i] = None;
        }
    }
    // Greedily take remaining unmatched target letters to turn guess letters yellow
    for (i, square) in result.iter_mut().enumerate() {
        if square.color == Color::White {
            if let Some(pos) = remaining.iter().position(|&c| c == Some(guess[i])) {
                square.color = Color::Yellow;
                remaining[pos] = None;
            }
        }
    }
    // Any unmatched squares become gray
    for square in result.iter_mut() {
        if square.color == Color::White {
            square.color = Color::Gray;
        }
    }
    result
}

I tried to keep this all straightforward, but I still wasn't too confident that I had nailed all of the edge cases. I was delighted by Rust's testing support:

#[cfg(test)]
mod tests {
    use super::*;
    fn expect_score(target: &str, guess: &str, colors: Vec<Color>) {
        // ...
    }
    #[test]
    fn correct_guess() {
        expect_score("ARRAY", "ARRAY", vec![Color::Green; 5]);
    }
    // ...
}

Using cargo was also a welcome relief from fighting with C++ and git submodules. I used rand to pick a random target word, colored to print colored squares to the terminal, and serde_json to read the wordlist files.

Results

The TypeScript implementation works well, and it's actually my preferred Wordle to use due to its simple design, infinite puzzles, and the fact that it lets me keep playing after 6 wrong guesses. In hindsight, some memoization could have improved the performance of my declarative approach, as recoloring every row on every render undoubtedly has a performance penalty. That said, it would have been a tradeoff, and I don't think it's necessary given the relatively small number of guesses being used.

The Rust implementation is undoubtedly a bit less usable, being a CLI app, but I learned a lot about using constructs within Rust. While TypeScript had given me some intuition for how type aliases and type inference work, and C++ had given me a basic understanding of stack and heap memory, concepts such as Traits and the borrow checker were completely new to me. This wasn't a huge project, but the variety of data structures and paradigms it involved gave me a decent birds-eye view of Rust as a language.

]]>
https://breq.dev/projects/wordle /projects/wordle Fri, 13 May 2022 00:00:00 GMT
<![CDATA[Dynamic Music]]>

A screenshot of the environment.

Overview

This project is a virtual environment containing several sound sources, represented as spheres. The listener, represented with a cone, can navigate around the environment to hear different combinations of the sources. Additionally, they can move the sources around within the environment.

Motivation

I was inspired by the way it feels to work with music in a DAW: almost like exploring some sort of space. I decided to create a virtual space which replicated that feeling, allowing anyone to play with the mixing of a song. I also made this out of a desire to work with Three.js.

Technical Description

The app is written in vanilla JS using Vite for build tooling. Each audio track was a wav file exported from an Ableton Live set. I decided to take an object-oriented approach to the code layout, representing each sound object with its own JavaScript class.

Results

Aside from a, uh, learning moment causing some issues with the audio panning, most of the project was straightforward. In the end, I think I succeeded at creating the environment I set out to create, although a bit more variety in the sound sources (perhaps multiple sections of the piece which could be alternated between?) might have helped the experience not bore the listener as quickly.

]]>
https://breq.dev/projects/dynamic-music /projects/dynamic-music Fri, 13 May 2022 00:00:00 GMT
<![CDATA[ATtiny85 Stacker Game]]>

A quick demonstration of the game.

Overview

This is a standalone game device. The game is a simple stacking game, requiring the user to press the button at just the right time to align each layer. The button is implemented by sensing skin conductivity.

Motivation

I wanted to try my hand at a low-power embedded project. In the past, I had done projects which ran off of battery power, but I wanted to try to make something that could be left with the batteries in for years and still function. This required working with the sleep mode on my microcontroller of choice, which was something I hadn't used before.

Technical Description

The microcontroller is an ATtiny85, and it's connected to an I2C LED matrix with an HT16K33 chip.

The techniques for putting the microcontroller in sleep mode are based on this tutorial from Adafruit.

The code uses functions from <avr/sleep.h> and <avr/power.h>, documented on the avr-libc site: sleep.h, power.h. (Documentation for the set_sleep_mode macro is notably absent, but the source is here.) The Wandering Engineer also published a more detailed description of the registers used (PCMSK and GIMSK). The MCUCR (MCU Control Register) is documented here.

Here's the meat of the implementation:

void setup() {
  // ...
  // Disable unused peripherals (Timer1 and ADC)
  power_timer1_disable();
  power_adc_disable();
  // Enable interrupt on PB4 pin change
  // (Set the Pin Change Interrupt 4 bit in the Pin Change Mask register)
  PCMSK |= _BV(PCINT4);
  // ...
}
void awaitButton() {
  // Send command 0x20 to the I2C display
  // (system setup / turn off system oscillator / standby mode)
  TinyWireM.beginTransmission(0x70);
  TinyWireM.write(0x20);
  TinyWireM.endTransmission();
  // Enable Pin Change Interrupts
  // (Set the Pin Change Interrupt Enable bit in the General Interrupt Mask register)
  GIMSK = _BV(PCIE);
  // Disable all modules
  power_all_disable();
  // Enter power down mode
  // Set the sleep mode 1 bit (SM1) in the MCU control register (MCUCR)
  set_sleep_mode(SLEEP_MODE_PWR_DOWN);
  // Enable sleep mode
  // Set the sleep enable bit (SE) in the MCU control register (MCUCR)
  sleep_enable();
  // Enable interrupts globally
  // (Set the global interrupt flag, I, in the status register, SREG)
  sei();
  // Enter sleep mode
  sleep_mode();
  // Clear the General Interrupt Mask register
  GIMSK = 0;
  // Enable Timer 0 and the USI (Serial) peripherals
  power_timer0_enable();
  power_usi_enable();
  // Re-initialize the display
  TinyWireM.begin();
  initDisplay(0);
}

Results

I've left it with the batteries in for more than a year by now, and I haven't noticed a drop in the display brightness. I'd consider it a success, although the rationale behind some of the hardware decisions has definitely been lost to time... Why didn't I just use an actual button? Regardless, this project certainly helped me get closer to understanding AVRs at a low level.

]]>
https://breq.dev/projects/tiny-stacker /projects/tiny-stacker Fri, 13 May 2022 00:00:00 GMT
<![CDATA[Nuisance]]>

The Nuisance dashboard, in light mode.

Nuisance is a dashboard I made collecting links to Northeastern University student portals and online services.

Motivation

Northeastern has a few different collections of links: myNortheastern, which was disorganized and shut down, and the Student Hub, which is, frankly, filled with irrelevant information and loading spinners. I wanted an unobtrusive portal that would load quickly and link to the services I found most useful.

Technical Description

The page is built using React and Create-React-App, and styled with Tailwind. It doesn't have any notable features other than a dark mode option and a setting to choose between opening links in the current tab or a new tab. Both of these settings are persisted in localStorage. In hindsight, React was almost certainly not necessary, but I had experience with a component-based project structure and wanted to iterate quickly.

I picked Cloudflare Pages for hosting, to make sure the site loaded fast.

Results

This was a lot more successful than I had thought it would be. I shared it with a few friends, and it's accumulated a fair number of users across Northeastern. In hindsight, it isn't that surprising to think that since I had a problem, others probably did too.

I didn't collect a ton of user feedback: I added GitHub links for suggestions, but only a month or so after I initially shared it. Adding them from the start might have led to broader adoption. They didn't take much effort to add, so in future projects, I'll include them from the beginning instead of waiting for popularity to come.

I also didn't have any sort of analytics configured (since I had meant for this project to just be for myself), which made it hard to measure how much traffic the dashboard was getting. In hindsight, I guess it doesn't really matter what the numbers say, though. This project made my day-to-day life a bit more smooth, and it helped out a few friends, too.

]]>
https://breq.dev/projects/nuisance /projects/nuisance Thu, 24 Mar 2022 00:00:00 GMT
<![CDATA[BotBuilder]]>

A couple demo commands built using Blockly.

Overview

BotBuilder is an online tool that allows people to build custom Discord commands by dragging blocks. These commands are then added as slash commands to the user's guild (Discord server).

Motivation

After building the flask-discord-interactions library, I realized how Discord's Interactions API could enable interesting custom commands with less overhead than a traditional Gateway-based bot. Many of my friends wanted to create their own Discord bot to include custom commands, but running a Discord bot typically requires registering as a developer, finding hosting, handling tokens, installing a library, writing code, and other tasks that might prove difficult for someone inexperienced with programming. I wanted to create a service that would allow users to add custom commands to their Discord servers without any prior knowledge of bot development or code.

Technical Description

Users log in with their Discord account using OAuth2 to access their workspace. There, they are able to use Blockly to create commands.

When users edit their workspace, the workspace is converted to XML and uploaded to the BotBuilder server, and if they close and reopen the workspace, their previous workspace is automatically downloaded and restored. At each upload, each command is also compiled into JavaScript, and each JavaScript function is uploaded to the BotBuilder server.

Initially, users must click the "Add To Server" button, which will direct them through Discord's OAuth2 flow to authorize BotBuilder to add commands to their guild. After this, when new commands are uploaded, the BotBuilder server calls the Discord API to update the slash commands present in each of the user's guilds, adding any new commands and removing any deleted ones. It also tracks, in a Redis database, which users have pushed commands to which guilds.

When a command is executed in one of these Discord servers, Discord will send a POST request to BotBuilder. After verifying the cryptographic signature of this request, BotBuilder will look up which users added commands to the guild, and look through these commands for one that matches the incoming request.

After identifying the command, BotBuilder will execute the JavaScript code and return the result as a response to the POST request. To execute JavaScript, I'm using PyMiniRacer, a tool developed by Sqreen to run and interact with Google's V8 JavaScript engine from Python.

I chose to use JavaScript as the compilation target for the commands for security reasons. Executing, say, Python code in a sandbox would be difficult to do securely. JavaScript, on the other hand, is designed to be run in the browser, which is an inherently sandboxed environment.
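
As a rough sketch of what evaluating a compiled command with PyMiniRacer could look like (the command function and calling convention here are made-up examples, not BotBuilder's actual code):

from py_mini_racer import MiniRacer
# JavaScript compiled from a user's Blockly workspace (hypothetical example)
compiled_command = """
function greet(user) {
  return "Hello, " + user + "!";
}
"""
ctx = MiniRacer()            # a fresh V8 context, isolated from the host process
ctx.eval(compiled_command)   # load the compiled command function
result = ctx.call("greet", "breq")   # invoke it with arguments from the interaction
print(result)                # "Hello, breq!"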

Results

Not many of my friends ended up using this. I think the main reason was that not enough features were present, and most people who wanted to make a custom command would need more functionality. Some of this was a limitation of Discord--without a bot user, the Interactions API is much more limited, and it doesn't allow applications to look up user information, modify roles, or do other actions that most bots typically do. However, supporting additional features like custom command arguments and HTTP request blocks might have turned this into a more useful service.

]]>
https://breq.dev/projects/botbuilder /projects/botbuilder Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Flask Discord Interactions]]>

Some commands for Breqbot Lite, a bot I made with this library.

Overview

Recently, Discord introduced a new Slash Commands feature that allows bots to integrate using webhooks. This is a library that handles registering the commands, receiving interactions, sending responses, sending followup messages, and including message components like clickable buttons in your message. It's written as a Flask extension, so you can add other pages to the app and handle scaling/serving like any other Flask app.

Motivation

Most Discord bots and libraries use a Bot user to connect to the Discord API. Bot users interact with Discord in a similar way to actual Discord users: they connect over a WebSocket and then send and receive events such as messages. This approach works well for basic bots, but it makes it difficult to scale. Alternatively, webhook-based bots can be deployed behind a load balancer and scaled up or down as needed without worrying about overloading the websocket or allocating different servers to different processes.

That said, the webhook approach is significantly more limited. Webhook bots can't manage channels, reactions, direct messages, roles, or most of the other features in Discord. However, for basic bots that don't need these features, webhook bots can be easier to develop and deploy.

Technical Description

The library is designed to be similar to the popular Discord.py library. It's probably better to show than to tell:

import os
from flask import Flask
from flask_discord_interactions import DiscordInteractions
app = Flask(__name__)
discord = DiscordInteractions(app)
app.config["DISCORD_CLIENT_ID"] = os.environ["DISCORD_CLIENT_ID"]
app.config["DISCORD_PUBLIC_KEY"] = os.environ["DISCORD_PUBLIC_KEY"]
app.config["DISCORD_CLIENT_SECRET"] = os.environ["DISCORD_CLIENT_SECRET"]
@discord.command()
def ping(ctx):
    "Respond with a friendly 'pong'!"
    return "Pong!"
discord.set_route("/interactions")
discord.update_slash_commands(guild_id=os.environ["TESTING_GUILD"])
if __name__ == '__main__':
    app.run()

The discord.command() decorator creates a SlashCommand and adds it to the application, and the discord.set_route() function adds an HTTP route that handles interaction events. The library will automatically register the commands and remove old ones on launch. When it receives an interaction from Discord, it will verify the signature, parse the command options, run the command, and return the result.
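
For context, the signature check is the standard Ed25519 verification Discord requires of interaction endpoints; a minimal sketch using PyNaCl (an illustration, not the library's exact internals) looks roughly like this:

from nacl.exceptions import BadSignatureError
from nacl.signing import VerifyKey
def verify_interaction(public_key_hex: str, signature_hex: str,
                       timestamp: str, raw_body: str) -> bool:
    """Check Discord's X-Signature-Ed25519 / X-Signature-Timestamp headers
    against the application's public key."""
    verify_key = VerifyKey(bytes.fromhex(public_key_hex))
    try:
        verify_key.verify(f"{timestamp}{raw_body}".encode(),
                          bytes.fromhex(signature_hex))
        return True
    except BadSignatureError:
        # Discord expects a 401 response when the signature is invalid
        return False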

Results

This was one of the first OAuth2 projects I made, which was cool. It works well enough for my basic testing bot. Overall, I'm pretty proud of this one: I saw a gap where a library didn't exist, and I developed something to fill it.

I'm also really glad to see that a small community has sprung up around this library. I listed it in the official Discord Developer Documentation, among the numerous other Slash Command libraries. So far, four other people have contributed code to the project through pull requests, and 22 issues have been filed in the issue tracker. It's been an interesting experience to receive a bug report or feature request from a community member and then figure out how to prioritize it and how best to patch it.

]]>
https://breq.dev/projects/flask-discord-interactions /projects/flask-discord-interactions Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[3D Printer Light Tower]]>

NeoPixels never look good on camera.

Overview

This is a small indicator light strip that I mounted next to my 3D printer. It indicates if the printer is ready, printing, or in an error state.

Motivation

In June 2021, I started a summer job in the Texas Instruments semiconductor fab in South Portland. There were a lot of different machines in the fab, but they all had one standardized light tower design. This helped me learn how to operate them more quickly, transfer my knowledge between different areas, and understand the status of a machine at a glance.

I noticed that the indicators shown for a fab machine (load, unload, error) were similar to those I might want to show for a 3D printer. So, I set about building my own "light tower" to attach to my 3D printer.

Technical Description

I used Adafruit's NeoPixel library to write a Python script that changes the status displayed on the light tower. However, this script needs to run as root, since Adafruit's library uses DMA (direct memory access) to control the Pi's PWM module.
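
A minimal sketch of that kind of status script, using the CircuitPython NeoPixel library; the pin, pixel count, and status colors are assumptions, not the project's exact values:

import sys
import board
import neopixel
COLORS = {
    "ready": (0, 255, 0),     # green
    "printing": (0, 0, 255),  # blue
    "error": (255, 0, 0),     # red
}
# Strip on GPIO 18 (a PWM-capable pin); this is why the script needs root
pixels = neopixel.NeoPixel(board.D18, 8, brightness=0.3, auto_write=False)
def set_status(status: str) -> None:
    pixels.fill(COLORS.get(status, (0, 0, 0)))
    pixels.show()
if __name__ == "__main__":
    set_status(sys.argv[1] if len(sys.argv) > 1 else "ready")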

I then wrote an Octoprint plugin that handles printer events. I used the EventHandlerPlugin mixin to write a basic Python script that called the NeoPixel script when necessary.

Here is when an issue arose: Octoprint runs with a user account, but the NeoPixel script needs to run as root. I needed some way to allow the Octoprint user account to execute a program as the root user.

I decided to use the Unix "setuid" system to allow the Octoprint user to invoke the NeoPixel script with the permissions of the root user. Since setuid can't be used for scripts, I wrote a wrapper function to pass along the arguments and run the NeoPixel script as root.

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char* argv[]) {
    // With the setuid bit set on this binary, setuid(0) switches to root
    setuid(0);
    // Replace this process with the NeoPixel script, forwarding the arguments
    argv[0] = "./tower.py";
    execv("./tower.py", argv);
    return 0;
}

With this in place, the chain of Octoprint -> Plugin -> Setuid Wrapper -> NeoPixel Script -> NeoPixels worked perfectly.
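
For illustration, the plugin end of that chain could look something like the sketch below; the event-to-status mapping and the wrapper path are assumptions, not the actual plugin code:

import subprocess
import octoprint.plugin
# Map OctoPrint events onto tower statuses (simplified)
STATUS_FOR_EVENT = {
    "PrintStarted": "printing",
    "PrintDone": "ready",
    "PrintFailed": "error",
    "Error": "error",
}
class LightTowerPlugin(octoprint.plugin.EventHandlerPlugin):
    def on_event(self, event, payload):
        status = STATUS_FOR_EVENT.get(event)
        if status:
            # The setuid wrapper execs tower.py as root with the same arguments
            subprocess.run(["/home/pi/light-tower/wrapper", status], check=False)
__plugin_pythoncompat__ = ">=2.7,<4"
__plugin_implementation__ = LightTowerPlugin()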

Results

Generally, none of my projects have required an understanding of Linux permissions and libc functions like setuid and exec. A lot of my knowledge in this area was built up by watching LiveOverflow. It was really fun to finally actually put some of this knowledge to use, and writing a setuid program myself helped me better understand some of the security issues involved and the other specifics of this system.

]]>
https://breq.dev/projects/light-tower /projects/light-tower Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Links]]>

The sign-in screen.

The dashboard for the URL shortener.

Exactly what it says on the tin: a basic URL shortening service that allows changing the destination of the URL after creating it.

Motivation

I had done some server-side-rendering work with Flask before, but I hadn't ever approached it from the JavaScript side of things. I wanted to understand some of the frameworks that are used to perform SSR using Node.

At the time I made this, the college application season wasn't that far behind me, and I remember seeing mail from some colleges using a plus sign from their domain as a URL shortener, such as https://wpi.edu/+FJI3DE. This seemed like a cool idea, since it wouldn't interfere with existing routes but would provide short URLs on a recognizable and trusted domain.

Technical Description

URLs are stored in Redis, since a key-value store seemed like a great fit for this scenario. Routing is handled using koa. For requests to shortened URLs, a Redis lookup followed by a redirect is all that happens.

For the login and dashboard pages, I used templates written in nunjucks, a JavaScript templating language similar to Python's jinja2 (used with Flask).

Authentication was handled with JWTs as cookies, and I used the GitHub OAuth API instead of storing usernames and passwords. I didn't use a client-side JavaScript framework, just plain old HTML forms and a bit of vanilla JS to handle some show/hide buttons.

Results

This project made me shift my thinking a lot. In many regards, I'd been spoiled by front-end frameworks. Working without one, I had to think about how I could use the platform to accomplish my goals. This in turn gave me a better understanding of concepts like JWTs, cookies, and HTML forms.

It was cool to use platform features like forms and cookies to handle data submission and authentication without client side JS. Contrast this with flowspace, in which I just passed the token as an Authorization header for every fetch POST request I made. This was a pretty simple project, but I'm glad I took it on.

]]>
https://breq.dev/projects/links /projects/links Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Picto]]>

This is a clone of Pictochat built on top of Web technologies.

Motivation

I wish we never met
We broke up on PictoChat, crying on my DS
I went to a birthday party for one of her friends
And now that this is over I can hate them, I don't have to pretend

- Glitch Gum, "NEVER MET!"

I grew up with a Nintendo DS, so it's no surprise I have a ton of nostalgia for Pictochat. And when a certain hyperpop song rekindled that nostalgia, I wanted to find a way to experience Pictochat again.

Technical Description

The project is mostly just a React single-page-application (although there's a small WebSocket server component here to rebroadcast messages).

I didn't strip any assets from Pictochat itself, and I'm not much of a sound or icon designer, so I made do with what I could find. I picked similar sounds from material.io and icons from Font Awesome. Some of them are a better match than others, but overall, they match up pretty well.

I made an effort to have usable keyboard navigation. The original DS allowed using either the stylus or control pad for navigating the interface, so I wanted this project to have a similar experience.

I also tweaked the onboarding flow a bit. In the original Pictochat, the name and theme color of the user were read from the DS system settings. Since I'm only cloning Pictochat itself, I instead prompted for these during onboarding. I also tweaked the chatroom mechanics somewhat. The original Pictochat used the Nintendo Low Latency Protocol to create chatrooms with nearby DS handhelds within range over 2.4GHz. While a similar system would be fun to implement on dedicated Linux boxes (IBSS + batman-adv, anyone?), I certainly couldn't implement it in a Web browser, so chatrooms are global.

One of my favorite aspects of Pictochat was the closed-ness of the rooms: even though you might be in a room with unfamiliar people (e.g. at an event), you could be certain it was a relatively small group that you could get to know. Having four global chatrooms seemed counter to this, so I instead opted to let chatroom names be any arbitrary string. My hope was that people would choose mostly-unique names, keeping the number of users per chatroom relatively low.

Building this as a React app and styling it with Tailwind was mostly straightforward. It was a bit difficult to get the viewport string correct such that the window wouldn't become narrower on mobile -- for once, we don't actually want the site to be responsive, since it has to exactly match the layout of the original Pictochat! Also, keeping the state of the canvas in the message compose box was tough, since the state of the image couldn't be extracted from the canvas element itself. As a workaround, I passed a ref object down to the canvas, which assigned it to a dispatch function. Then, the parent component could dispatch commands down into the child and request state to flow up from it. It's ugly, but I can't think of a better way of doing things: I can't exactly change how the platform works, and mixing multiple data models never works well.

Results

It isn't perfect, but it's accurate enough that I was able to relive some of my childhood: typing a bunch of text and scribbling furiously with the pen tool to make a message completely black, copying and editing a message to write all over it, and dragging letters all over the page as decoration.

The WebSocket connection had some reliability issues, and I think my reconnection logic might have been broken somehow. Other than that, I was pretty happy with how everything turned out. It didn't blow up, but it helped me and my friends scratch that nostalgic itch, which was nice.

]]>
https://breq.dev/projects/picto /projects/picto Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Cards]]>

Overview

This is a service to generate custom "cards" based on a defined template and user-supplied fields.

Motivation

While working on Breqbot, I wanted to replicate the "rank card" idea provided by bots like MEE6, but with user-supplied information and images instead.

I started by writing a program using PIL that would take in a user's name, bio, and profile image, and generate a basic PNG. I was frustrated by the process and the end result. I had to manually implement things I had taken for granted in the world of web-dev, such as text wrapping and emoji support. The process of implementing and modifying the card templates was time-consuming and tedious. Additionally, when I tried to include these rudimentary images on Breqbot's website, I needed to redo the entire layout in HTML and CSS.

I had the idea of creating a standalone service to generate these cards based on a predefined template and output them to either an IFrame or an image file. The resulting output could be used anywhere: sent as a Discord message, included in a GitHub README, or embedded in a website.

Technical Description

The service will render an HTML template with the user-provided parameters. Then, if an image file is requested, it will use pyppeteer to take a screenshot of the HTML template using Chrome.
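
The screenshot step with pyppeteer looks roughly like the sketch below; the sizing and template-loading details are simplified assumptions:

import asyncio
from pyppeteer import launch
async def render_card(html: str, width: int = 800, height: int = 300) -> bytes:
    """Render an HTML card template in headless Chrome and return a PNG."""
    browser = await launch(args=["--no-sandbox"])
    try:
        page = await browser.newPage()
        await page.setViewport({"width": width, "height": height})
        await page.setContent(html)
        return await page.screenshot({"type": "png"})
    finally:
        await browser.close()
# png_bytes = asyncio.run(render_card("<h1>Hello!</h1>"))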

It's also possible to "freeze" a card, preserving its screenshot on the server and returning a permanent link/URL to the card. This avoids having to use pyppeteer for every request for the card. To generate the card IDs, I'm using another service I made, Snowflake.

Results

I wasn't originally a huge fan of using a headless browser on the server side, as it seemed like a waste of resources and the wrong tool for the job, but the service ended up working pretty well, although the time-to-first-byte is, predictably, pretty poor compared to the other projects I've made.

Performance aside, the process of developing new cards is much easier now. At the time of writing, I've pretty much only dipped my toes into web development, but I was able to make a few templates pretty quickly that looked much better than the old PIL tool ones.

While integrating this service with Breqbot and analyzing how it could be used, I noticed that most use cases will generate a card once and then embed it repeatedly. For instance, users will request each others' cards on Breqbot more often than they will update their own, and cards put in e.g. GitHub profiles are typically created once and left as-is for a while. As such, it's kind of wasteful to regenerate the card for every request, so I implemented the "Freezing" functionality. This was a cool experience: deploying a project, seeing how it was used, and then adding functionality where it was lacking.

]]>
https://breq.dev/projects/cards /projects/cards Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Mini-ITX Computer Case]]>

The finished product had about the same footprint as my laptop.

In late 2018/early 2019, I decided to build a desktop computer for use doing 3D rendering and experimenting with machine learning.

Unfortunately, I didn't document my work nearly as well as I should have, so this writeup is going to be a bit heavy on the renders and light on the photos.

Overview

This was a custom-built PC case that houses a mini-ITX motherboard, SFX power supply, and small-form-factor GPU. It was built out of a mix of wooden parts and 3D-printed parts that I designed.

Motivation

Around this time, I had started taking 3D rendering at MSSM, and I was disappointed with the poor rendering performance of my laptop. I decided that I wanted to build a more powerful computer that would better handle tasks like 3D modeling, rendering, and deep learning. In order to get the best sustained performance at my price point, I decided to go with a desktop instead of a higher-end laptop or a NUC.

However, I wasn't a big fan of how large most desktop computers were. Most PC cases seemed extremely large, which wasn't convenient for me at all, since I had to go back and forth to MSSM every month at the time. While form factors like ITX (motherboards) and SFX (power supplies) had started to take off, even cases built for them seemed excessively large.

I decided to build my own case, with a goal of making something no larger than a few vertically-stacked binders. I wanted it to be able to fit in my backpack or suitcase to make transportation easy.

Technical Description

The case consists of three wooden panels (the top, bottom, and front) that all screw into the 3D-printed interior structure. Because my 3D printer has only 120mm of build plate space, the interior is split into 6 separate sections which are held together by their mutual attachment to the wooden panels.

The case contains two distinct airflow zones--one for the GPU and one for the CPU. The GPU slots into the left side of the case, connecting to the motherboard using a riser. The PSU sits directly underneath it. Additionally, there is space for an SSD alongside the fan.

On the other side, a hard drive sits underneath the motherboard, and there is space for an additional drawer for storing USB cables or other small parts. Above these, the motherboard sits on standoffs, and an SSD can attach to the front. This area has its own fan which blows air mostly above the motherboard, but some goes towards the hard drive as well.

In the far corner of the render, you can see a small keyhole shape. This is a place where a power button, status LED, or other type of module can be inserted. I only ever made one such module, a simple on/off switch.

Results

I used this case for over a year, and it held up pretty well. By the end of its life, the power button had broken, and the screws were starting to strip inside the 3D-printed plastic. However, structurally, it held up well, even after getting roughly tossed into a suitcase many times.

After I came back home from MSSM, I transferred my rig back over to an off-the-shelf case, since I had plenty of space for it. That said, I certainly wouldn't call the project a failure--it was a great case for the 2 years or so that I used it, and I learned a lot about 3D printing in the process. Specifically, this project made me really consider the structural qualities of a 3D print, and I had to choose the orientation of each piece carefully in order to ensure each one was strong. While I used up plenty of filament in the process, by the end, this was something that I'm really proud of!

]]>
https://breq.dev/projects/itx-case /projects/itx-case Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[MakerGamer]]>

MakerGamer running on a PocketCHIP

Overview

MakerGamer was a virtual video game console designed to make game development as accessible as possible and to encourage the sharing of games. It could play games written in Scratch, Python, and JavaScript, and it provided basic tools to modify the games people downloaded. I wrote it with the intent of running it on Next Thing Co.'s PocketCHIP.

Motivation

This project was inspired by Pico-8, a “fantasy console” that allowed people to write retro-feeling games for modern hardware. However, the learning curve for using Pico-8 was high, and I thought it would be cool to create something in the style of Pico-8 but that would be suitable for people learning to code for the first time. I set out to create a similar “fantasy console” that could play games written in Scratch but felt like a traditional video game console.

Technical Description

I designed and programmed an interface for downloading Scratch projects by the project ID, adding them to a virtual library, and playing them on demand, using Phosphorous to run them without Adobe Flash. Python games written using PyGame were executed with parameters set to make PyGame full screen. Web-based games were downloaded and opened in a web browser. The user interface (main menu, project editors, etc.) and the code editor were written in Python using PyGame.

Results

It worked well enough for me to demo it to people at the STEM fair. But it didn't really have that much actual use, as all the games could be played in a web browser anyway. I eventually scrapped plans to finish the project-editing functionality, as it wasn't practical to write a code editor, image editor, and so on that only ran on a tiny screen.

]]>
https://breq.dev/projects/makergamer /projects/makergamer Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[R2D2 Clone]]>

If you're wondering, the chassis is an old Play-Doh bucket.

Overview

Exactly what it says on the tin: a remote-controlled robot built to look like R2D2. As time went on, I did add some more advanced features, such as text-to-speech and video streaming.

Motivation

This was made for several occasions: the STEM fair, and Halloween later that year when I went trick-or-treating as Luke Skywalker.

Technical Description

I started with the guts of an earlier robot I had built: two Lego NXT motors, connected to a Raspberry Pi through an H-Bridge motor driver. I spent a lot of time adding features to this robot: it could be controlled using a Wii remote, a smartphone app, or an interactive website. It had speech synthesis software and an internal speaker, a camera mounted to allow video streaming, and a few LEDs for blinkenlights.

By the end of the project, I had switched to using an NXT brain to control the motors, connected over USB to the Raspberry Pi.

Results

It did its job and did it well. This project was one of the first significant software projects I worked on, and it taught me how to design and organize a program longer than just a few lines.

]]>
https://breq.dev/projects/r2d2 /projects/r2d2 Wed, 23 Mar 2022 00:00:00 GMT
<![CDATA[Rave Choker / Outshine]]>

The latest revision of the choker, displaying a multi-color fade/wipe animation.

The first revision of the choker, displaying a simple back-and-forth animation.

This project consists of three parts:

  • outshine, an Arduino program that displays animations on a NeoPixel strand
  • OutshineApp, an Android app to communicate with an Arduino running Outshine via USB serial
  • the rave choker, the physical device that I built which runs Outshine firmware

Motivation

Outshine

Outshine itself was based on some work I did for NU Rover, Northeastern University's team which competes in the Mars Society's University Rover Challenge. On the rover, LEDs are controlled by an ATmega328PB separate from the primary STM32 microcontroller. As such, to implement the NeoPixel handling, I essentially had free rein over this chip.

Functionally, the rover is required to display different colors to indicate its control mode (teleoperated or autonomous) and status about its autonomous navigation. It does not need to show animations, but I included them anyway because I figured they could be useful for displaying more detailed status information.

Rave Choker

I got a ticket to a Rezz show, and I wanted to incorporate some flashy LEDs in my outfit somehow. Rezz is known for her LED glasses which display a spiral pattern. I wanted to create something similarly colorful and animated, but not a direct reimplementation of the glasses.

The process started with me standing in front of a mirror and holding a lit LED strip up to different parts of my body. Eventually, I settled on a choker, because I liked the look of it the best.

Originally, I had intended for the choker to be controlled via Wi-Fi by broadcasting its own network and exposing a server with a web UI. I had some familiarity with this architecture, as it's similar to how I built the wall matrix project.

At the time, I had recently bought a limited-edition pink RP2040 Feather board from Adafruit, and I really wanted to use it for a project. This board doesn't have Wi-Fi or Bluetooth, so I figured I would add an additional Pi Zero to handle the networking.

With the added bulk of another board in mind, I decided to explore moving the boards and battery off of my neck and into a hat.

This worked to some extent, but the stretchiness of the hat didn't pair well with the fragile ribbon of the LED strip. In the end, I ditched the idea as I thought it seemed too fragile for extended use.

OutshineApp

Going back to the drawing board, I decided to try leaving the Feather board on my neck and directly attaching it to my phone over USB. The phone would interface with the board over serial and provide power to it (removing the need for a dedicated battery).

I tried to use the WebSerial API for this, but was hit with a really stupid issue involving Chrome on Android.

As such, I switched to a native app. I built it in React Native, because I already had some familiarity with React.

Redesign

Tragically, this first revision of the choker was left slightly trampled after I went a little too hard in a food house moshpit. I was able to quickly repair the broken solder joints, and the LED strip was barely damaged (somehow only the red channel of the last pixel was broken), but I still realized that I needed to rethink the design.

I had the following goals in mind for the redesign:

  • Remove the need for a tether between the phone and the choker (as this could get dangerous in a crowd, or yanking it could cause the choker to come off of me)
  • Make the clasp at the back of the choker more sturdy
  • Remove the silicone outer layer from the LEDs

Technical Description

Outshine

The Outshine firmware takes in commands over I2C, UART, or Bluetooth. Initially, commands were four bytes long:

[1 byte] Red channel
[1 byte] Green channel
[1 byte] Blue channel
[1 byte] Animation command

Later, I reworked the protocol to use a two-bit indicator at the start of each byte, to prevent the two ends of the link from getting out of sync (a sketch of an encoder follows the list):

  • 01: Start Transaction, remaining 6 bits are the animation ID
  • 10: Color data, next 6 bits are a portion of an RGB color
    • Color data is shifted in by this command, so any existing color data will be shifted left by 6 bits
  • 11: Final color data, next 6 bits are color data and this is the end of an RGB color
    • This can be followed by either additional colors (10) or the end of the transaction (00)
  • 00: End Transaction
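
To make the framing concrete, here's a minimal JavaScript sketch of an encoder for it, roughly what OutshineApp needs to do before writing bytes out over serial or Bluetooth. The most-significant-bits-first chunk order is my assumption rather than something taken from the firmware source:

function encodeCommand(animationId, colors) {
  const bytes = [0b01000000 | (animationId & 0x3f)]; // 01: start transaction + animation ID

  for (const [r, g, b] of colors) {
    const rgb = (r << 16) | (g << 8) | b; // pack one color into 24 bits
    for (let shift = 18; shift >= 0; shift -= 6) {
      // 10: intermediate color chunk, 11: final chunk of this color
      const prefix = shift === 0 ? 0b11000000 : 0b10000000;
      bytes.push(prefix | ((rgb >> shift) & 0x3f));
    }
  }

  bytes.push(0b00000000); // 00: end transaction
  return Uint8Array.from(bytes);
}

// e.g. animation 3 with a single red color:
// encodeCommand(3, [[255, 0, 0]]);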

I2C support was added for Rover, UART support was added for debugging the Rover but was later used for the rave choker, and Bluetooth support was added for the V2 choker.

The driver relies on one of two underlying libraries, depending on the target architecture.

The code makes use of a table of function pointers to control the active animation. The animation state (e.g. current active pixel in a wipe animation, current frame in a fade animation) is persisted in a shared struct to make transitions between animations smoother.

OutshineApp

The app is built in React Native, using react-native-serialport to handle the USB serial communication and react-native-ble-plx to handle Bluetooth. I'm just using off-the-shelf components for the color wheel, buttons, and brightness slider. I didn't even bother to build a production version; I just keep the development .apk on my phone.

The V2 version of the app switched to simple color swatches and added support for multi-color configurations (complete with some pride-themed presets). This was much easier for me to manipulate quickly.

Rave Choker

The V1 rave choker itself consisted of a strand of NeoPixels connected to an Adafruit Pink RP2040 Feather. I used a "banana connector" as a clasp. It was wired to ground on both sides, so it wasn't electrically functional.

The body of the choker was housed in a 3D-printed case, made from two parts that screw together using threaded inserts. The case has an open top to show off the pink RP2040, because I thought it looked cool.

The V2 rave choker also uses a 3D-printed frame, but it completely encloses the board. This frame also includes a battery, allowing for operation without a tether. I continued to use a Feather, but I opted for the nRF52840 variant for Bluetooth support.

I used fake leather material and metal snaps to provide the structure of the choker, and I attached the NeoPixels to the material using E6000 adhesive, loosely following this guide for a harness bra from Adafruit. I again blacked out the copper contacts on the front of the LED strip, but this time, I used black nail polish instead of eyeliner. The result vaguely resembles a studded look, and the choker doesn't look out of place even with the lights turned off.

Results

It worked well at the Rezz show! In fact, a cute transfem noticed it, and we ended up talking and dancing together for most of the night :)

I had initially been worried about the battery life being an issue, but as long as the LEDs aren't solidly on full-white, it doesn't seem to be a huge problem.

I was also a bit worried about the "crowd-safety" of it--what if someone yanks on the cord?--but it came unplugged easily without yanking on my neck.

I also wore the choker to GAY BASH'D, a combination DJ set and drag show. I made a few revisions to the setup prior to this event, adding a brightness slider to OutshineApp and the 3D-printed case to the choker itself.

After getting the Bluetooth version up and running, I also wore the choker to Carpenter Brut and SVDDEN DEATH shows, both of which had much more intense moshing than shows I had been to in the past. That said, the choker held up well! It seems to have some level of beer-spill-resistance, which is essential.

Overall, I think I really achieved the goal I started with. The choker is something flashy, stylish, and uniquely "me," and it stands out in the perfect way among a crowd at an EDM show. Also, the Outshine project will undoubtedly prove a useful starting point for future experimentation with NeoPixels.

Starting off with a tethered solution wasn't ideal (for instance, it made going through security a bit of a hassle). The tether was manageable at more chill shows, but the Bluetooth-based design is still far more convenient.

It's also a bit cumbersome to change the animation, requiring me to unlock my phone and open the app. I only really wanted to change it every few songs, but it was still frustrating to have to stop raving to fiddle with a smartphone app. Physical controls would help remedy this, at the cost of more complexity. At the end of the day, I think the tradeoffs I made paid off.

]]>
https://breq.dev/projects/outshine /projects/outshine Sun, 13 Feb 2022 00:00:00 GMT
<![CDATA[WorkerSocket]]>
npm i workersocket

Overview

This is a library I made to run a WebSocket inside of a Web Worker in the browser. It exports a WorkerSocket object which behaves as closely as possible to a browser WebSocket.

Motivation

I wrote this while implementing my fork of roslib, a library for communicating with a ROS server through the rosbridge protocol. roslib is typically used to build a web-based monitoring UI for some type of robot.

The original roslib hijacked a browserify library called webworkify to run a WebSocket through a Web Worker, but webworkify doesn't bundle with Vite or Webpack 5.

The reason roslib defaults to putting a WebSocket inside of a Web Worker is to make sure that data is pulled away from the server as quickly as possible, even if it ends up building up in the web worker. Due to an oversight in the rosbridge server, if a client can't pull information quickly enough, then the server will use excessive resources caching data. This pull request for the original roslib describes the issue in more detail. Generally, the software on the robot is more important than the monitoring UI, so offloading the queueing to a background thread in the UI client makes sense.

Technical Description

The library consists of two parts: the web worker itself, and the WorkerSocket implementation.

The WorkerSocket class mimics the API of a WebSocket, but forwards messages to the web worker. The web worker then handles the actual socket. The WorkerSocket class maintains an array of listeners for each event and calls each one as necessary. It also generates a unique ID for each WorkerSocket class instantiated, which is then sent along with all messages to the web worker. The worker maintains a mapping from IDs to socket instances.
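
For a sense of the API, here's a minimal usage sketch. I'm assuming a named WorkerSocket export and the standard WebSocket event names here, so check the package README for the exact surface:

import { WorkerSocket } from "workersocket";

// Behaves like a WebSocket, but the underlying socket lives in a Web Worker.
const socket = new WorkerSocket("ws://localhost:9090");

socket.addEventListener("open", () => socket.send("hello"));
socket.addEventListener("message", (event) => console.log(event.data));
socket.addEventListener("close", () => console.log("socket closed"));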

The web worker is constructed using an object URL. This technique allows the worker to be bundled with any bundler without any special handling. It is somewhat unintuitive, though, using Function.toString() to turn the worker implementation into a string, then placing it into an immediately-invoked function expression using string manipulation. This is then turned into an object URL from which the worker is loaded.

const workerURL = URL.createObjectURL(
  new Blob(["(" + workerImpl.toString() + ")();"], {
    type: "text/javascript",
  })
);
const worker = new Worker(workerURL);

Testing is performed using Chai in the browser (and using Puppeteer to run headlessly).

Results

It seems to work well. That said, there are so many edge cases with the behavior of WebSockets in the browser that I'm a bit hesitant to use it. I've written plenty of tests to cover common use cases, and I don't know what I would add, but I also don't feel like the test suite is entirely complete.

I've published the result to npm though, in case others want to make use of this.

]]>
https://breq.dev/projects/workersocket /projects/workersocket Wed, 09 Feb 2022 00:00:00 GMT
<![CDATA[Universal Hooks]]>

If you've studied discrete math, you might be familiar with the concept of a "universal gate." You might have also heard that the NAND gate and NOR gate are each universal gates, since you can create any other gate by composing many NANDs or NORs.

React Hooks compose just like logic gates. If you use multiple hooks, they shouldn't interfere with each other, just as multiple logic gates in a circuit can operate independently. Any collection of Hooks can be abstracted away into a new custom hook, just as a collection of logic gates within a circuit can be abstracted into a single unit.

While we're not going to find a truly universal hook, we will show how a few basic hooks can be used to implement most of the others. We're also not going to aim for complete API compatibility -- each of our hooks will do its best to cover a few common use cases.

This post is intended to be interesting reading for those with some experience with React, and to demonstrate some of the principles of hook composition. Thus, some familiarity with React is assumed.

Our building blocks

The React docs list three "Basic Hooks." Let's step through and analyze what each of these do.

useState

This hook will keep track of some value between renders. It also provides a function to set the value. Calling this function will trigger a re-render with the updated value.
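
As a quick refresher, a minimal counter built on useState looks like this:

const Counter = () => {
  const [count, setCount] = React.useState(0); // persisted between renders
  return (
    <button onClick={() => setCount(count + 1)}>clicked {count} times</button>
  );
};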

useEffect

This hook will run a function after the first render and again whenever one of the provided dependency values changes. The function can return a cleanup callback, which runs before the next run of the effect and before the component unmounts.
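
For example, an effect that sets up (and tears down) an interval:

const Clock = () => {
  const [time, setTime] = React.useState(new Date());
  React.useEffect(() => {
    const interval = setInterval(() => setTime(new Date()), 1000);
    return () => clearInterval(interval); // cleanup runs before unmount
  }, []); // empty dependency array: run the effect once, after the first render
  return <span>{time.toLocaleTimeString()}</span>;
};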

useContext

This hook uses the Context API to deeply pass values down the component tree. Since none of the other built-in hooks use the Context API, we won't need to use this.

Implementations

I've taken the liberty to rearrange this list a bit from the docs, since we'll use some of the earlier hooks to build some of the later ones.

useRef

A Ref can be thought of as a "box" that holds a value. Refs are simply objects of the form:

{
  current: [...]
}

This allows you to update the value of the ref, and any code with a reference to it will see the updated value. For instance, consider this code:

let count = 0;
const makePrinter = (count) => () => console.log(count);
const printCount = makePrinter(count);
printCount(); // outputs 0
count += 1;
printCount(); // outputs 0

This is admittedly quite contrived. However, in React, this comes up much more often. If you aren't careful, your side effects, event callbacks, and other functions could close over an old value, and they won't get updated when state changes.

Let's see how a ref can fix this code:

const count = { current: 0 }; // or a ref created with React.createRef()
const makePrinter = (count) => () => console.log(count.current);
const printCount = makePrinter(count);
printCount(); // outputs 0
count.current += 1;
printCount(); // outputs 1

Now's a good time to point out the const. In JavaScript, "constant" is not the same as "immutable." The ref is the same object; it's just being mutated.

With this in mind, let's build our hook. We can use useState to keep some state around. However, we don't want to actually set the state at all -- refs don't trigger renders. Our hook can take in some default value, create an object with .current, and use useState to persist it across renders.

const useRef = (defaultValue) => {
  const [ref, setRef] = React.useState({ current: defaultValue });
  return ref;
};

We can also let our useRef take in a function to create the default value on demand (React's built-in useRef just stores whatever value you pass it). Thankfully, useState supports this: we can always call useState with a function argument, and in this function, we can test if the default value provided is itself a function, calling it if necessary.

const useRef = (defaultValue) => {
  const [ref, setRef] = React.useState(() => {
    if (typeof defaultValue === "function") {
      return { current: defaultValue() };
    } else {
      return { current: defaultValue };
    }
  });
  return ref;
};

Finally, let's finish with an example:

const RefExample = () => {
  const messageRef = useRef(null);
  React.useEffect(() => {
    const interval = setInterval(() => {
      if (messageRef.current.style.backgroundColor === "black") {
        messageRef.current.style.backgroundColor = "white";
      } else {
        messageRef.current.style.backgroundColor = "black";
      }
    }, 500);
    return () => clearInterval(interval);
  }, [messageRef]);
  return <div ref={messageRef}>Hello World!</div>;
};
const App = () => <RefExample />;
ReactDOM.render(<App />, document.getElementById("root"));

useReducer

This hook can be thought of primarily as an alternative to useState. A reducer is a function that takes in a state and an action and returns an updated state.

const initial = { count: 0 };
const reducer = (state, action) => {
  if (action.type === "increment") {
    return { count: state.count + 1 };
  } else if (action.type === "decrement") {
    return { count: state.count - 1 };
  } else if (action.type === "reset") {
    return { count: 0 };
  } else {
    throw new Error("Unsupported reducer action");
  }
};

We can use a reducer without React as follows:

let state = { ...initial };
console.log(state); // { count: 0 }
state = reducer(state, { type: "increment" });
console.log(state); // { count: 1 }
state = reducer(state, { type: "increment" });
state = reducer(state, { type: "increment" });
state = reducer(state, { type: "decrement" });
console.log(state); // { count: 2 }
state = reducer(state, { type: "reset" });
console.log(state); // { count: 0 }
reducer(state, { type: "double" }); // Error: Unsupported reducer action

All right, let's look back at React. Here's how you might use useReducer:

const ReducerExample = () => {
  const [state, dispatch] = React.useReducer(reducer, initial);
  return (
    <div>
      <span>count is {state.count}</span>
      <button onClick={() => dispatch({ type: "increment" })}>increment</button>
      <button onClick={() => dispatch({ type: "decrement" })}>decrement</button>
      <button onClick={() => dispatch({ type: "reset" })}>reset</button>
    </div>
  );
};
const App = () => <ReducerExample />;
ReactDOM.render(<App />, document.getElementById("root"));

The useReducer hook returns two values: state and dispatch. The state object is, unsurprisingly, the current state. The dispatch function is a function that calls the reducer with the current state to set the new state. Notably, you don't have to provide the current state to dispatch.

With this in mind, we can start scaffolding the signature of our version of this hook.

const useReducer = (reducer, initial) => {
  // ...
  return [null, () => {}];
};

We'll need to keep track of the current state somewhere. Let's use the useState hook for this. We'll also need to define some kind of dispatch function.

const useReducer = (reducer, initial) => {
  const [state, setState] = React.useState(initial);
  const dispatch = React.useCallback(
    (action) => {
      // do something with the current state
    },
    [state]
  );
  return [state, dispatch];
};

Finally, remember that our reducer has signature reducer(state, action). In our dispatch function, we can call our reducer with the current state and provided action, then set the new state to the returned value.

const useReducer = (reducer, initial) => {
  const [state, setState] = React.useState(initial);
  const dispatch = React.useCallback(
    (action) => {
      setState(reducer(state, action));
    },
    [state]
  );
  return [state, dispatch];
};

One more thing: React's useReducer allows passing in an initializer function as a third argument. We can match this behavior in our hook by passing a function to setState.

const useReducer = (reducer, initial, initFunction) => {
  const [state, setState] = React.useState(() => {
    if (initFunction) {
      return initFunction(initial);
    } else {
      return initial;
    }
  });
  const dispatch = React.useCallback(
    (action) => {
      setState(reducer(state, action));
    },
    [state]
  );
  return [state, dispatch];
};

And use it like this:

useReducer(reducer, 0, (count) => ({ count }));

useMemo

Memoization refers to the practice of making a function "remember" its past inputs and output, so that it doesn't need to execute again if its inputs don't change.

Consider a function like:

const expensive = (x) => {
console.log("executing very expensive computation...");
return Math.exp(x);
};

We can make a new, memoized version that "remembers" its past inputs and output. To associate these inputs and outputs, we can use an immediately-invoked function expression.

const memoized = (() => {
  let previousInput = null;
  let previousOutput = null;
  const expensive = (x) => {
    console.log("executing very expensive computation...");
    return Math.exp(x);
  };
  return (x) => {
    if (x === previousInput) {
      return previousOutput;
    } else {
      previousInput = x;
      previousOutput = expensive(x);
      return previousOutput;
    }
  };
})();

If we try this, we can see that the function is only evaluated when the input changes:

console.log("Not memoized");
console.log(expensive(0)); // executing very expensive computation..., 1
console.log(expensive(0)); // executing very expensive computation..., 1
console.log(expensive(0)); // executing very expensive computation..., 1
console.log(expensive(2)); // executing very expensive computation..., 7.3891
console.log("Memoized");
console.log(memoized(0)); // executing very expensive computation..., 1
console.log(memoized(0)); // 1
console.log(memoized(0)); // 1
console.log(memoized(2)); // executing very expensive computation..., 7.3891

This is the building block we'll need to use for our memoized function.

React's useMemo is a bit different, though. It assumes that the function closes over the values it needs, then uses an extra "dependency array" to determine when changes are necessary. Again, let's start with its signature.

const useMemo = (func, deps) => {
  return func();
};

func is the function that we want to memoize, and deps is the array of dependencies. This "works," but doesn't memoize anything.

Since our hook will be called on each render, we'll need a mechanism to keep the old value around. Remember, the whole point of this is to not re-invoke things on every render. We can use useRef to keep things around between renders.

const useMemo = (func, deps) => {
  const previousDeps = useRef(deps);
  const previousValue = useRef(func);
  return func();
};

Here, useRef(deps) returns a ref initialized to the dependency list. useRef(func) takes advantage of the special behavior of useRef when passed a function—it initializes the ref to the output of the function.

Let's compare the previous dependencies with the current dependencies. Object.is() and Array.prototype.every() are both helpful here.

const useMemo = (func, deps) => {
  const previousDeps = useRef(deps);
  const previousValue = useRef(func);
  const matches = deps.every((dep, i) =>
    Object.is(dep, previousDeps.current[i])
  );
  if (!matches) {
    previousDeps.current = deps; // remember the new deps for the next comparison
    previousValue.current = func();
  }
  return previousValue.current;
};

Let's finish this one off with an example.

const MemoExample = () => {
  const [x, setX] = React.useState(0);
  const [y, setY] = React.useState(0);
  const exp = useMemo(() => {
    console.log("Very expensive operation...");
    return Math.exp(x);
  }, [x]);
  return (
    <div>
      <span>
        x is {x}, y is {y}, exponent is {exp}
      </span>
      <button onClick={() => setX(x + 1)}>increment x</button>
      <button onClick={() => setY(y + 1)}>increment y</button>
    </div>
  );
};

You should see Very expensive operation... logged to the console whenever x changes, but not when y changes. exp should always be kept in sync with x, but the computation is only done when x changes.

useCallback

This hook is a variant of useMemo, but it serves a different use case. useMemo is primarily used to prevent extra computation. However, we can also use it to prevent the "identity" of a value from changing.

Conceptually, the output of useMemo changes only when the dependency array changes. This means, if the value is passed down to other components, those components will only re-render when the dependency array changes.

Here's an example (albeit a very contrived one) of this use:

const CallbackExample = () => {
  const [x, setX] = React.useState(0);
  const [y, setY] = React.useState(0);
  const squareX = useMemo(
    () => () => {
      setX(x * x);
    },
    [x]
  );
  return (
    <div>
      <span>
        x is {x}, y is {y}
      </span>
      <button onClick={() => setX(x + 1)}>increment x</button>
      <button onClick={() => setY(y + 1)}>increment y</button>
      <ExpensiveComponent squareX={squareX} />
    </div>
  );
};

This is mostly useful if the function goes into the dependency array of another hook somewhere else.

In this case, there's nothing expensive going on that needs memoizing--the function isn't actually executed anyway. But syntax like () => () => ... is difficult to read and understand. Thus, we can make a version of useMemo that takes in a value directly, instead of a function that returns it.

const useCallback = (callback, deps) => {
  return useMemo(() => callback, deps);
};
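
With that in place, the squareX callback from the earlier example reads much more naturally:

const squareX = useCallback(() => {
  setX(x * x);
}, [x]);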

Conclusion

In this post, we've walked through re-implementing a few React hooks by composing existing hooks and adding some logic. With just React.useState, we've implemented useRef, useReducer, useMemo, and useCallback.

Is this practical? Not really. But I hope it gives a sense for what React hooks are "made of," and gives a few examples on how hook composition can work.

]]>
https://breq.dev/2022/01/19/hooks /2022/01/19/hooks Wed, 19 Jan 2022 00:00:00 GMT
<![CDATA[remark-abcjs]]>
Sheet Music for "Nokia Tune"

Overview

remark-abcjs is a Remark plugin to render sheet music written in ABC notation.

Motivation

I wanted to learn more about the Unified ecosystem by writing a plugin for it, and this seemed like an interesting challenge. I also figured I might end up using it on my site if I ever get around to posting music-related content.

Technical Description

The plugin looks for nodes in the syntax tree with type code and language abc. This means you can write ABC notation as:

```abc
X: 1
T: Nokia Tune
M: 3/4
L: 1/8
K: Amaj
| e'd' f2 g2 | c'b d2 e2 | ba c2 e2 | a6 |
```

It then uses ABCJS to render the music to an SVG, storing the result in the data property of the node so that remark-rehype can render it as HTML.

Results

Works well enough that I'm using it successfully on my site. That said, compromises had to be made. ABCJS doesn't support Node.js environments out of the box, so I had to use patch-package to manually patch it, and then use a build script to include the patched version. My patch uses JSDOM to create a document object for DOM manipulation.

Also, this project showed me, to put it frankly, how broken the ES module rollout has become. The UnifiedJS collective has more or less entirely switched to pure ESM packages, which can't be require()'d. On the other hand, Gatsby is still purely CommonJS. As a result, any Gatsby site has to pin an old version of remark/rehype and friends. I was primarily developing this plugin for my own site, but I wanted to support the latest standards, so I used Babel to transpile the ES module source to CommonJS. This added complexity to the build process, and I had to pin the CommonJS dual mode versions of all the UnifiedJS packages I depended on. This ended up being kind of the worst of both worlds.

Overall, though, I'm happy I took on this project. I ended up with something useful and I learned a lot about the inner workings of remark and the rest of the Unified ecosystem.

]]>
https://breq.dev/projects/remark-abcjs /projects/remark-abcjs Tue, 04 Jan 2022 00:00:00 GMT
<![CDATA[Building a text processing pipeline with Unified]]>

Unified is a set of software packages designed to work with text data. Many projects, including Gatsby, use it to render Markdown. In this post, I'll walk through setting up a processor using Unified. We'll start by just processing .txt files, but by the end, we'll have a working compiler from Markdown to HTML. We'll also write several of our own plugins for Unified.

I'm going to assume some basic familiarity with JavaScript and NPM, but my hope is that those new to Node or modern JavaScript will still be able to follow along. That said, the topics will gradually get more difficult as the post continues.

Setup

Unified is written as JavaScript modules intended to be run with NodeJS. Start by setting up a new npm package.

mkdir unified-example
cd unified-example
npm init -y

This will generate a package.json file for you.

The Unified ecosystem uses ECMAScript modules exclusively. However, Node defaults to using CommonJS modules. We will need to modify the package.json file to enable modules. Open up this folder in your text editor of choice, and add a "type": "module" declaration at the end:

{
  [...],
  "type": "module"
}

Next, we can install Unified itself.

npm install unified

Let's create an index.js file for our code.

import { unified } from "unified";
const processor = unified();

And finally, add an npm run build command to our package.json. Modify the "scripts" section to add the following.

"scripts": {
"build": "node ./index.js",
...
}

We can now run our pipeline with npm run build.

And... nothing happened! That's okay: We aren't feeding anything to our processor yet.

Input and Output

Let's configure our processor to read from a src directory, and write to a dist directory.

mkdir src
mkdir dist

Now, we need some way to run multiple files through our pipeline. We can use unified-engine for this. The engine will select all files from our source paths, run them through the processor, and output them to the destination path.

npm install unified-engine

And finally, let's use it in our code.

import { unified } from "unified";
import { engine } from "unified-engine";
const processor = unified();
await new Promise((resolve, reject) => {
  try {
    engine(
      {
        processor,
        files: ["./src/**/*.txt"],
        output: "./dist",
      },
      resolve
    );
  } catch (error) {
    reject(error);
  }
});

unified-engine will call a callback function when it is finished processing all files. Unfortunately, it doesn't support await-ing the result directly. So, we use await new Promise to wait for the callback. If you aren't familiar with Promises, you can think of this as "waiting for the callback to be called" instead of writing a separate callback function.

Now, if we run this... still nothing. We don't have any files to process. We can make an index.txt file in the src directory, and run it again:

echo "Hello world" > src/index.txt
npm run build
./src/index.txt
1:1 error TypeError: Cannot `parse` without `Parser`
at assertParser (file:///Users/breq/code/unified-example/node_modules/unified/lib/index.js:507:11)
at Function.parse (file:///Users/breq/code/unified-example/node_modules/unified/lib/index.js:265:5)
at parse (file:///Users/breq/code/unified-example/node_modules/unified-engine/lib/file-pipeline/parse.js:50:36)
at wrapped (file:///Users/breq/code/unified-example/node_modules/trough/index.js:111:16)
at next (file:///Users/breq/code/unified-example/node_modules/trough/index.js:62:23)
at done (file:///Users/breq/code/unified-example/node_modules/trough/index.js:145:7)
at file:///Users/breq/code/unified-example/node_modules/unified-engine/lib/file-pipeline/configure.js:76:5
at file:///Users/breq/code/unified-example/node_modules/unified-engine/lib/configuration.js:138:11
✖ 1 error

The pipeline is trying to process our input file, but it doesn't have any parser configured.

Parsers

Parsers are what Unified uses to convert an input file into a syntax tree. They exist for plaintext (.txt files), Markdown, and HTML.

In general, packages to work with Unified are split into three groups: remark for handling markdown, rehype for handling HTML, and retext for handling plain text.

For our example, we're reading in a .txt file. We can use a retext plugin to convert it to a syntax tree. Currently, retext plugins are available for English and Dutch, plus a catchall retext-latin plugin for languages that use Latin-based scripts.

Let's assume we're going to work exclusively with English.

npm install retext-english

And now, we can add our parser to our pipeline.

import { unified } from "unified";
import { engine } from "unified-engine";
import retextEnglish from "retext-english";
const processor = unified().use(retextEnglish);
await new Promise((resolve, reject) => {
  // ...
});

Give it another npm run build and...

./src/index.txt
1:1 error TypeError: Cannot `stringify` without `Compiler`
at assertCompiler (file:///Users/breq/code/unified-example/node_modules/unified/lib/index.js:520:11)
at Function.stringify (file:///Users/breq/code/unified-example/node_modules/unified/lib/index.js:281:5)
at stringify (file:///Users/breq/code/unified-example/node_modules/unified-engine/lib/file-pipeline/stringify.js:59:31)
at wrapped (file:///Users/breq/code/unified-example/node_modules/trough/index.js:111:16)
at next (file:///Users/breq/code/unified-example/node_modules/trough/index.js:62:23)
at Object.run (file:///Users/breq/code/unified-example/node_modules/trough/index.js:33:5)
at run (file:///Users/breq/code/unified-example/node_modules/unified-engine/lib/file-pipeline/index.js:57:10)
at wrapped (file:///Users/breq/code/unified-example/node_modules/trough/index.js:111:16)
at next (file:///Users/breq/code/unified-example/node_modules/trough/index.js:62:23)
at done (file:///Users/breq/code/unified-example/node_modules/trough/index.js:145:7)
✖ 1 error

Our pipeline is processing our file, but it can't stringify and save the result. This is where we need a compiler.

Compilers

Compilers are what Unified uses to convert a syntax tree back into a file. Just like with parsers, they exist for all sorts of markup languages. For now, let's keep things simple and output the result as a .txt file.

Again, the retext ecosystem will help us. We can use retext-stringify as our compiler to output another .txt file.

npm install retext-stringify
import { unified } from "unified";
import { engine } from "unified-engine";
import retextEnglish from "retext-english";
import retextStringify from "retext-stringify";
const processor = unified().use(retextEnglish).use(retextStringify);
await new Promise((resolve, reject) => {
...
});

Finally, our pipeline runs! We now have a dist/index.txt file containing our "Hello world" text.

...so what was the point of this? Right now, it seems like all we have is a complicated way to copy files between directories. But the intermediate syntax tree is where the magic happens—we can perform all sorts of processing steps on our text.

Syntax Trees

Before diving into what syntax trees let us do, let's take a look at what one looks like.

Syntax trees in Unified follow the unist specification. This spec defines nodes, which can be either parent nodes (which contain other nodes) or literal nodes (which contain some specific value).

The unist-util-inspect package is a useful tool for inspecting unist syntax trees. Let's add it to our pipeline.

npm install unist-util-inspect

Using this library is a bit tricky right now. unist-util-inspect isn't aware of any of the Unified tooling we have—it's just a function that takes in a syntax tree. We need to hook into the pipeline somehow.

To do this, we need to write our own plugin.

Making a Plugin

In the Unified ecosystem, a plugin is a function that takes in some options and returns another function. The returned function is then called on the syntax tree.

Let's write a plugin called inspectPlugin that logs the syntax tree to the console.

// ...
import { inspect } from "unist-util-inspect";
function inspectPlugin(options = {}) {
  return (tree, file) => {
    console.log(inspect(tree));
  };
}
const processor = unified()
  .use(retextEnglish)
  .use(retextStringify)
  .use(inspectPlugin);
await new Promise((resolve, reject) => {
  // ...
});

Not too bad, right? Writing our own plugin only took 5 lines of code. Now, if we run our pipeline again, we should see:

RootNode[2] (1:1-2:1, 0-12)
├─0 ParagraphNode[1] (1:1-1:12, 0-11)
│ └─0 SentenceNode[3] (1:1-1:12, 0-11)
│ ├─0 WordNode[1] (1:1-1:6, 0-5)
│ │ └─0 TextNode "Hello" (1:1-1:6, 0-5)
│ ├─1 WhiteSpaceNode " " (1:6-1:7, 5-6)
│ └─2 WordNode[1] (1:7-1:12, 6-11)
│ └─0 TextNode "world" (1:7-1:12, 6-11)
└─1 WhiteSpaceNode "\n" (1:12-2:1, 11-12)

This is the syntax tree that our pipeline built. Specifically, this is the tree that retext-english created, and it's what retext-stringify used to compile our output file.

Text Processing

Spell Checking

There are plenty of retext plugins that can work with text. Let's start by adding spell checking to our pipeline using retext-spell. We also need to install a dictionary package: let's use dictionary-en.

npm install retext-spell dictionary-en

Now add

import retextSpell from "retext-spell";
import dictionary from "dictionary-en";

to the imports, and add

const processor = unified()
  // ...
  .use(retextSpell, { dictionary });

to the processor. You might notice that we're passing in an object to .use. These are the configuration options for the plugin. Most plugins accept some optional configuration, but in this case, retext-spell requires the dictionary option.

Run our pipeline again, and nothing extra should happen. Let's misspell some words and run it again!

echo "Ehllo world" > src/index.txt
npm run build
1:1-1:6 warning `Ehllo` is misspelt; did you mean `Hello`? ehllo retext-spell
⚠ 1 warning

And there you have it: spell checking!

Other ReText Plugins

We can pull in more plugins, too. But first, let's get a bit more source text. I'm going to use a snippet of one of my project writeups:

The STM32 microcontroller this project used doesn't have any purpose-built
hardware for generating sounds (that I'm aware of). So, the solution I
settled on was to manually generate a square wave by setting a GPIO pin
high, waiting for half the length of the waveform, setting it low, and
waiting for the rest of the waveform.
The biggest hurdle with this approach was accurate timing. The STM32 can
use interrupts to delay for a precise number of milliseconds, but
generating square waves at specific frequencies requires sub-millisecond
precision. The solution I came up with was to calibrate a busy-wait loop
when the code begins using the millisecond timer, then use that busy-wait
loop for sub-millisecond-precision delays. This yielded a decent-sounding
square wave, but the game audio still felt incomplete.

We should also probably stop logging the entire syntax tree to the console. Comment out the console.log for now in our custom plugin.

Let's install some more prose plugins. I'm going to throw pretty much the entire suite of plugins into our pipeline.

npm install retext-contractions retext-diacritics retext-equality retext-indefinite-article retext-profanities retext-repeated-words retext-smartypants retext-quotes

Our full code should now look like:

import { unified } from "unified";
import { engine } from "unified-engine";
import retextEnglish from "retext-english";
import retextStringify from "retext-stringify";
import { inspect } from "unist-util-inspect";
import retextSpell from "retext-spell";
import dictionary from "dictionary-en";
import retextContractions from "retext-contractions";
import retextDiacritics from "retext-diacritics";
import retextEquality from "retext-equality";
import retextIndefiniteArticle from "retext-indefinite-article";
import retextProfanities from "retext-profanities";
import retextRepeatedWords from "retext-repeated-words";
import retextSmartypants from "retext-smartypants";
import retextQuotes from "retext-quotes";
function inspectPlugin(options = {}) {
  return (tree, file) => {
    // console.log(inspect(tree));
  };
}
const processor = unified()
  .use(retextEnglish)
  .use(retextStringify)
  .use(inspectPlugin)
  .use(retextSpell, { dictionary })
  .use(retextContractions)
  .use(retextDiacritics)
  .use(retextEquality)
  .use(retextIndefiniteArticle)
  .use(retextProfanities)
  .use(retextRepeatedWords)
  .use(retextSmartypants)
  .use(retextQuotes);
await new Promise((resolve, reject) => {
  try {
    engine(
      {
        processor,
        files: ["./src/**/*.txt"],
        output: "./dist",
      },
      resolve
    );
  } catch (error) {
    reject(error);
  }
});

And run! The file was written to ./dist/index.txt successfully, but there were a few warnings:

./src/index.txt > dist/index.txt
1:5-1:10 warning `STM32` is misspelt; did you mean `STM32nd`? stm32 retext-spell
1:11-1:26 warning `microcontroller` is misspelt microcontroller retext-spell
1:45-1:52 warning Expected the apostrophe in `doesn't` to be like this: `doesn’t` smart-apostrophe retext-contractions
1:113-1:116 warning Expected the apostrophe in `I'm` to be like this: `I’m` smart-apostrophe retext-contractions
1:210-1:214 warning `GPIO` is misspelt; did you mean `GPO`? gpio retext-spell
3:64-3:69 warning `STM32` is misspelt; did you mean `STM32nd`? stm32 retext-spell
⚠ 6 warnings

A few technical words ("STM32", "microcontroller", "GPIO") are incorrectly detected as misspelled. We can add a personal dictionary to resolve this.

echo "STM32\nmicrocontroller\nGPIO" > dictionary.txt

Now, we can configure retext-spell to use our personal dictionary.

import fs from "fs/promises";
const personal = await fs.readFile("./dictionary.txt", "utf8");
// ...
const processor = unified()
  // ...
  .use(retextSpell, {
    dictionary,
    personal,
  });

Now, we only have quote errors remaining. retext-contractions expects us to use smart apostrophes. retext-smartypants adds those automatically. If you look at dist/index.txt, you'll see that doesn't is now doesn’t. So why is retext-contractions complaining?

Plugin Types and Plugin Order

The issue is the order in which our plugins are used. Since retext-contractions comes before retext-smartypants, the smart apostrophes are checked before they've been inserted.

However, you might notice that retext-stringify is the second plugin we use, yet the other plugins modify the tree before it is stringified and written to disk. Why?

The answer is that retext-stringify works a bit differently than you'd expect. Instead of performing some operation on the tree directly, it configures the processor object, setting itself as the compiler. This means that even though the plugin is one of the first in the pipeline, nothing is executed until the pipeline reaches the compile step.

Let's reorder our plugins. I'm going to list the parser first, then the plugins that modify the tree (retext-smartypants), then those that check the tree (including retext-contractions), and finally the compiler (retext-stringify). Again, the parser and compiler can go anywhere in the order, but placing them at the beginning and end reduces confusion.
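
For reference, here's roughly what the reordered chain looks like at this point (I've left out the inspect plugin for brevity):

const processor = unified()
  // Parser
  .use(retextEnglish)
  // Plugins that modify the tree
  .use(retextSmartypants)
  // Plugins that check the tree
  .use(retextSpell, { dictionary, personal })
  .use(retextContractions)
  .use(retextDiacritics)
  .use(retextEquality)
  .use(retextIndefiniteArticle)
  .use(retextProfanities)
  .use(retextRepeatedWords)
  .use(retextQuotes)
  // Compiler
  .use(retextStringify);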

Our pipeline should run without warnings!

Now's a good time to stop and test out some of the plugins we're using:

  • Spell something wrong for retext-spell to catch
  • Put an apostrophe in the wrong place (e.g. do'nt)
  • Miss some diacritics (e.g. a la carte)
  • Use a example instead of an example
  • Repeat a word, The bird in the the bush
  • Use gendered language (e.g. postman)
  • Use profane language (e.g. stupid)

Unfortunately, with any large amount of text, a lot of false positives can occur. In most cases, you'll only want to use a few of these plugins to lint your text. I'm going to remove retext-contractions and retext-diacritics at this step.

Markdown

You'll probably want to use Markdown for any serious work. Markdown allows you to embed links, images, code blocks, and other content into your text.

This is where the remark family of plugins can help. We'll use remark-parse to parse our Markdown files, and remark-stringify to convert the tree back to Markdown.

npm install remark-parse remark-stringify

Splitting the pipeline

Right now, our plugins are designed to work with and modify a text syntax tree. If we want to process Markdown, we'll need some way to convert the Markdown syntax tree into a text syntax tree.

This is more complicated than it sounds. We can't use a plugin to replace the Markdown syntax tree with a text one, since we still need to output the Markdown.

What we can do is create a separate pipeline that only deals with prose. Let's move our existing pipeline to a new file, prose.js.

import { unified } from "unified";
import retextEnglish from "retext-english";
import retextStringify from "retext-stringify";
import retextSpell from "retext-spell";
import dictionary from "dictionary-en";
import retextEquality from "retext-equality";
import retextIndefiniteArticle from "retext-indefinite-article";
import retextProfanities from "retext-profanities";
import retextRepeatedWords from "retext-repeated-words";
import retextSmartypants from "retext-smartypants";
import retextQuotes from "retext-quotes";
import fs from "fs/promises";
const personal = await fs.readFile("./dictionary.txt", "utf8");
const processor = unified()
  // Parser
  .use(retextEnglish)
  // Transform prose
  .use(retextSmartypants)
  // Check prose
  .use(retextSpell, {
    dictionary,
    personal,
  })
  .use(retextEquality)
  .use(retextIndefiniteArticle)
  .use(retextProfanities)
  .use(retextRepeatedWords)
  .use(retextQuotes)
  // Compiler
  .use(retextStringify);
export default processor;

Now, we can import this in our index.js.

import { unified } from "unified";
import { engine } from "unified-engine";
import processor from "./prose.js";
await new Promise((resolve, reject) => {
  try {
    engine(
      {
        processor,
        files: ["./src/**/*.txt"],
        output: "./dist",
      },
      resolve
    );
  } catch (error) {
    reject(error);
  }
});

Parsing Markdown

We need some Markdown to parse. I'm using this as my source. Save it as src/index.md.

Next, we'll make a new pipeline that can parse Markdown, using remark-parse and remark-stringify. We'll also configure our engine to look for .md files instead of .txt files.

import { unified } from "unified";
import { engine } from "unified-engine";
import remarkParse from "remark-parse";
import remarkStringify from "remark-stringify";
const processor = unified().use(remarkParse).use(remarkStringify);
await new Promise((resolve, reject) => {
  try {
    engine(
      {
        processor,
        files: ["./src/**/*.md"],
        output: "./dist",
      },
      resolve
    );
  } catch (error) {
    reject(error);
  }
});

Next, we need to actually do the split. If we want to bridge from Markdown to text, from remark to retext, we can use remark-retext!

npm install remark-retext
// ...
import remarkRetext from "remark-retext";
const processor = unified()
  .use(remarkParse)
  .use(remarkRetext)
  .use(remarkStringify);
// ...

You'll notice that this gives an error, though. The remark-retext plugin is looking for some retext parser to use. This is where our original prose pipeline comes in.

// ...
import prosePipeline from "./prose.js";
const processor = unified()
  .use(remarkParse)
  .use(remarkRetext, prosePipeline)
  .use(remarkStringify);
// ...

And now, the pipeline should run! You'll probably see some warnings about spelling, showing that the markdown content is getting fed through the spell checker. You might want to update the dictionary before moving on.

Mutating Markdown

Look at dist/index.md. What happened to the smart quotes? In the prose pipeline, we feed our text through retext-smartypants to convert straight quotes and apostrophes to curly/smart quotes. But that isn't being reflected in the Markdown output.

Once we split the pipeline, any changes we make to the text syntax tree won't propagate back to the Markdown tree. Splitting is a one-way process.

Thankfully, we can use remark-smartypants instead of retext-smartypants to mutate the Markdown tree.

npm install remark-smartypants

Add it to the Markdown pipeline:

// ...
import remarkSmartypants from "remark-smartypants";
const processor = unified()
  .use(remarkParse)
  .use(remarkSmartypants)
  .use(remarkRetext, prosePipeline)
  .use(remarkStringify);
// ...

Finally, remove retext-smartypants from the prose pipeline. Run the pipeline again, and you should see smart quotes in the Markdown output. You should also see that retext-quotes doesn't complain about quote usage. Since we apply retext-smartypants before splitting the pipeline, the changes are also reflected in the prose syntax tree.

More Markdown Plugins

Let's add a few more plugins to our pipeline.

  • remark-slug: Generate a slug for each heading, letting people link directly to it.
  • remark-gfm: Parse GitHub-style tables.
  • remark-frontmatter: Parse YAML frontmatter.
npm install remark-slug remark-gfm remark-frontmatter
// ...
import remarkSmartypants from "remark-smartypants";
import remarkSlug from "remark-slug";
import remarkGfm from "remark-gfm";
import remarkFrontmatter from "remark-frontmatter";
const processor = unified()
  .use(remarkParse)
  .use(remarkFrontmatter)
  .use(remarkGfm)
  .use(remarkSlug)
  .use(remarkSmartypants)
  .use(remarkRetext, prosePipeline)
  .use(remarkStringify);
// ...

Try adding some frontmatter, tables, etc. to the Markdown file.

---
title: "Hello World"
---
| this | is |
| ---- | ----- |
| a | table |

Right now, we're just taking in Markdown and spitting it out. Let's try actually rendering it to HTML.

HTML

Just as remark is used for Markdown, rehype is used for HTML. Instead of writing our Markdown pipeline back to Markdown, let's transform it to HTML and write that out.

We'll need remark-rehype to transform Markdown to HTML, and rehype-stringify to write the HTML back to a file. We'll also use vfile-rename to rename the .md files to .html files.

npm install remark-rehype rehype-stringify vfile-rename

Now, we can add these to our pipeline. vfile-rename isn't a proper plugin, but it only takes a bit of code to make it work.

// ...
import remarkRehype from "remark-rehype";
import rehypeStringify from "rehype-stringify";
import { rename } from "vfile-rename";
const processor = unified()
  .use(remarkParse)
  .use(remarkFrontmatter)
  .use(remarkGfm)
  .use(remarkSlug)
  .use(remarkSmartypants)
  .use(remarkRetext, prosePipeline)
  .use(remarkRehype)
  .use(() => (tree, file) => {
    rename(file, { extname: ".html" });
  })
  .use(rehypeStringify);
// ...

Run the pipeline again, and you should see index.html with the output.

Document Structure

You might notice that the index.html doesn't include <head> and <body>. In order to actually create a complete HTML document, we need to add those. rehype-document can turn an HTML fragment into a full document.

npm install rehype-document

Add it to the pipeline, and you should see a complete HTML document.
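
Concretely, that's one more import and one more .use call somewhere after remark-rehype (we'll pass it some options in a later step):

import rehypeDocument from "rehype-document";

const processor = unified()
  // ...
  .use(remarkRehype)
  .use(rehypeDocument)
  // ...
  .use(rehypeStringify);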

Title

Our index.html has a <title> tag, but it's just set to index. Ideally, we'd want to be able to set the title from the frontmatter.

There's no existing plugin to take care of this, but we can write one ourselves. We can extract the title from the frontmatter using remark-extract-frontmatter, and then use hast-util-select and hast-util-from-string to modify the <title> tag.

npm install hast-util-select hast-util-from-string remark-extract-frontmatter yaml

Adding these, our pipeline looks like this:

// ...
import { select } from "hast-util-select";
import { fromString } from "hast-util-from-string";
import YAML from "yaml";
import remarkExtractFrontmatter from "remark-extract-frontmatter";
const processor = unified()
  .use(remarkParse)
  .use(remarkFrontmatter)
  .use(remarkExtractFrontmatter, { yaml: YAML.parse })
  .use(remarkGfm)
  .use(remarkSlug)
  .use(remarkSmartypants)
  .use(remarkRetext, prosePipeline)
  .use(remarkRehype)
  .use(rehypeDocument, {
    title: "Untitled",
  })
  .use(() => (tree, file) => {
    const title = file.data.title || "Untitled";
    const tag = select("title", tree);
    if (tag) {
      fromString(tag, title);
    }
  })
  .use(() => (tree, file) => {
    rename(file, { extname: ".html" });
  })
  .use(rehypeStringify);

Set the title property in the frontmatter of the markdown, and check that it is updated in the .html output.

Formatting

Right now, the HTML output isn't particularly readable. We can add rehype-format to clean things up. Alternatively, you might want to use rehype-minify to reduce the file size.

npm install rehype-format

Add this to the pipeline right before the call to .use(rehypeStringify).
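
In other words:

import rehypeFormat from "rehype-format";

const processor = unified()
  // ...
  .use(rehypeFormat)
  .use(rehypeStringify);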

Code

Add some code to the index.md.

```js
() => (tree, file) => {
  const title = file.data.title || "Untitled";
  const tag = select("title", tree);
  if (tag) {
    fromString(tag, title);
  }
};
```

The HTML output is fine. The code is printed in a monospace font. However, in most cases, you'll want to display code with syntax highlighting. The Prism library is a popular choice, and it's supported in Unified through rehype-prism.

npm install @mapbox/rehype-prism
import rehypePrism from "@mapbox/rehype-prism";
// ...
  .use(rehypePrism)

This won't work on its own, however. We need to add the Prism theme to actually apply the highlighting. Thankfully, all we need to do is add the URL to the css option of rehype-document.

.use(rehypeDocument, {
  title: "Untitled",
  css: [
    "https://cdnjs.cloudflare.com/ajax/libs/prism/1.25.0/themes/prism.min.css",
  ],
})

There it is! Language-specific code highlighting has been added to the pipeline.

Math

There are a lot of cases where you might want to include math in your Markdown. To accomplish this, math is typically written using LaTeX inside of $ blocks. Here's what it looks like:

$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$

Getting math to display in our pipeline takes two steps. First, when parsing Markdown, we need to add a new plugin to extract the math blocks into syntax tree nodes. Then, we need a plugin to render the LaTeX to HTML once we have our HTML syntax tree.

There are two major math libraries for the web: MathJax and KaTeX. We'll proceed using KaTeX, since it is more lightweight.

npm install remark-math rehype-katex

Next, add these to the pipeline, and add the KaTeX CSS similarly to how we added the Prism theme. The pipeline looks like this:

const processor = unified()
  .use(remarkParse)
  .use(remarkFrontmatter)
  .use(remarkExtractFrontmatter, { yaml: YAML.parse })
  .use(remarkGfm)
  .use(remarkMath)
  .use(remarkSlug)
  .use(remarkSmartypants)
  .use(remarkRetext, prosePipeline)
  .use(remarkRehype)
  .use(rehypePrism)
  .use(rehypeKatex)
  .use(rehypeDocument, {
    title: "Untitled",
    css: [
      "https://cdnjs.cloudflare.com/ajax/libs/prism/1.25.0/themes/prism.min.css",
      "https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.15.1/katex.min.css",
    ],
  })
  .use(() => (tree, file) => {
    const title = file.data.title || "Untitled";
    const tag = select("title", tree);
    if (tag) {
      fromString(tag, title);
    }
  })
  .use(() => (tree, file) => {
    rename(file, { extname: ".html" });
  })
  .use(rehypeFormat)
  .use(rehypeStringify);

Both remark-math and rehype-katex support both inline and block mode math. Inline mode can be written using a single $ as a delimiter, and block mode uses two $$.
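
For example, this Markdown source mixes the two:

The discriminant of $ax^2 + bx + c$ is $b^2 - 4ac$.

$$
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$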

Music

You also might want to include sheet music in your Markdown. The most popular format for embedding music notation in websites is ABC. There aren't any working remark libraries for this, but we can write our own.

For syntax, let's use three backticks like a code block, and set the language to abc.

```abc
X: 1
T: Nokia Tune
M: 3/4
L: 1/8
K: Amaj
| e'd' f2 g2 | c'b d2 e2 | ba c2 e2 | a6 |
```

Now, we can start writing our plugin. Create a new file, music.js:

const remarkMusic = () => {
  return (tree, file) => {};
};
export default remarkMusic;

Import the plugin and add it to the pipeline in index.js.
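
One reasonable placement is shown below; the only hard requirement is that it runs before remark-rehype, since remark-rehype is what will consume the data field we're about to add:

import remarkMusic from "./music.js";

const processor = unified()
  .use(remarkParse)
  // ...
  .use(remarkMusic)
  .use(remarkRetext, prosePipeline)
  .use(remarkRehype)
  // ...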

The next step is to select the music nodes in the syntax tree. Let's start by just inspecting the whole tree, to get a sense of what we're looking for.

import { inspect } from "unist-util-inspect";
const remarkMusic = () => {
  return (tree, file) => {
    console.log(inspect(tree));
  };
};
export default remarkMusic;
...
└─10 code "X: 1\nT: Nokia Tune\nM: 3/4\nL: 1/8\nK: Amaj\n| e'd' f2 g2 | c'b d2 e2 | ba c2 e2 | a6 |" (42:1-49:4, 1623-1717)
lang: "abc"
meta: null

All right, our music is in the syntax tree, in a node with type="code" and lang="abc". Let's start by mapping the code nodes to abc nodes.

To perform this mapping, we can use unist-util-map.

npm install unist-util-map
// ...
import { map } from "unist-util-map";
const remarkMusic = () => {
  return (tree, file) => {
    return map(tree, (node) => {
      if (node.type === "code" && node.lang === "abc") {
        return {
          type: "abc",
          value: node.value,
        };
      } else {
        return node;
      }
    });
  };
};
// ...

If you run the pipeline now, you'll see that the ABC source is now just kind of dropped into the HTML. Unsurprisingly, remark-rehype has no idea what to do with it.

If, however, we add a data field to the abc nodes we create, we will be able to pass an HTML syntax tree node to remark-rehype.

const remarkMusic = () => {
  return (tree, file) => {
    return map(tree, (node) => {
      if (node.type === "code" && node.lang === "abc") {
        return {
          type: "abc",
          value: node.value,
          data: {
            hName: "div",
            hProperties: {
              className: ["abc"],
              style: "color: red",
            },
            hChildren: [
              {
                type: "text",
                value: node.value,
              },
            ],
          },
        };
      } else {
        return node;
      }
    });
  };
};

In this example, we just create a div for the ABC source, and show it in red.

However, if we want to actually render ABC to HTML, we'll need to use a library that works with the DOM. Let's get things set up beforehand by creating a DOM element and converting it to an AST node.

Since we're working with Node, we don't have access to a global document object from which to call createElement. Instead, we can use JSDOM. We'll also need hast-util-from-dom to convert the DOM node to an AST node.

npm install jsdom hast-util-from-dom
import { JSDOM } from "jsdom";
import { fromDom } from "hast-util-from-dom";
// ...
const remarkMusic = () => {
  return (tree, file) => {
    return map(tree, (node) => {
      if (node.type === "code" && node.lang === "abc") {
        const {
          window: { document },
        } = new JSDOM();
        const renderInto = document.createElement("div");
        renderInto.innerHTML = node.value;
        renderInto.style.color = "red";
        const data = fromDom(renderInto);
        return {
          type: "abc",
          value: node.value,
          data: {
            hName: data.tagName,
            hProperties: data.properties,
            hChildren: data.children,
          },
        };
      } else {
        return node;
      }
    });
  };
};
// ...

All right! We've done almost everything we need; all that's left is to map the ABC source to an HTML DOM node. Thankfully, considering the popularity of ABC notation, there's a library for that: abcjs.

npm install abcjs

Now, we just tell abcjs to render into our JSDOM node.

import ABCJS from "abcjs";
// ...
const remarkMusic = () => {
  return (tree, file) => {
    return map(tree, (node) => {
      if (node.type === "code" && node.lang === "abc") {
        const {
          window: { document },
        } = new JSDOM();
        const renderInto = document.createElement("div");
        ABCJS.renderAbc(renderInto, node.value);
        const data = fromDom(renderInto);
        return {
          type: "abc",
          value: node.value,
          data: {
            hName: data.tagName,
            hProperties: data.properties,
            hChildren: data.children,
          },
        };
      } else {
        return node;
      }
    });
  };
};

If you're following along, you might get an error after this step. Up until this commit, ABCJS relied on the global window object, which doesn't exist in Node. At the time of writing, that fix hasn't made it into a proper release yet. As a workaround, you can install the 6.0.0 beta:

npm install abcjs@^6.0.0-beta

And... still doesn't work. When rendering to a DOM node, ABCJS tries to call document.createElement, which (obviously) fails. We will need to patch the package manually.

npm install patch-package

I added the following patch:

diff --git a/node_modules/abcjs/src/write/svg.js b/node_modules/abcjs/src/write/svg.js
index 174602b..fae9221 100644
--- a/node_modules/abcjs/src/write/svg.js
+++ b/node_modules/abcjs/src/write/svg.js
@@ -2,6 +2,9 @@
 /*global module */
+const JSDOM = require("jsdom").JSDOM;
+const document = (new JSDOM()).window.document;
+
 var svgNS = "http://www.w3.org/2000/svg";
 function Svg(wrapper) {

And... there we go! Runs without issue, and transforms our ABC source into beautiful sheet music.

Pulling it all together

Let's throw a few more styles in, just to make things look a bit nicer.

.use(rehypeDocument, {
  title: "Untitled",
  css: [
    "https://cdnjs.cloudflare.com/ajax/libs/prism/1.25.0/themes/prism.min.css",
    "https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.15.1/katex.min.css",
    "https://cdnjs.cloudflare.com/ajax/libs/bootstrap/4.6.1/css/bootstrap.min.css",
  ],
  style: "body { margin: 0 auto !important; max-width: 800px; }",
})

And finally, let's throw some markdown at this! Here's a snippet that includes basically everything we're doing:

---
title: My Awesome Markdown
---
# Hello, World!
This is a Markdown document. It's a good test for our pipeline. I hope I ~~spellled~~ everything right. If someone finds a spelling **error**, she should let me [know](mailto:breq@breq.dev).
- This is a list item
- This is another list item
1. This is a numbered list item
1. This is another numbered list item
## Tables
| $x$ | $x^2$ |
| --- | ----- |
| 1 | 1 |
| 2 | 4 |
| 3 | 9 |
## Formulas
$$
x! = \begin{cases}
x = 0: & 1 \\
x > 0: & x (x - 1)! \\
\end{cases}
$$
$$
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$
## Code
```python
def bisect(f, a, b):
    c = (a + b) / 2
    if f(c) == 0:
        return c
    elif f(a) * f(c) < 0:
        return bisect(f, a, c)
    else:
        return bisect(f, c, b)
```
```jsx
export default function Home() {
  return <span>Hello, world!</span>;
}
```
## Music
```abc
X: 1
T: Nokia Tune
M: 3/4
L: 1/8
K: Amaj
| e'd' f2 g2 | c'b d2 e2 | ba c2 e2 | a6 |
```

And, let's render it one last time.

Looks nice! And we'll check the warnings:

./src/index.md > dist/index.html
8:3-8:11 warning `spellled` is misspelt; did you mean `spelled`? spellled retext-spell
8:71-8:74 warning `she` may be insensitive, use `they`, `it` instead he-she retext-equality
⚠ 2 warnings

The pipeline is warning us about both the spelling error and the unnecessary use of gendered language.

Summary

We've built a pipeline that, ultimately, turns Markdown into HTML. But we've used the Unified ecosystem to add plenty of other features:

  • Use YAML frontmatter to set the page title
  • Use GitHub-Flavored Markdown to render tables, strikethroughs, and other features absent from CommonMark (the commonly used Markdown spec).
  • Parse ABC notation and render sheet music
  • Add syntax highlighting to code blocks
  • Render LaTeX math formulas
  • Add slugs to headings to support direct linking
  • Convert simple/straight quotes into smart quotes
  • Pretty-print the output HTML

As well as our prose pipeline, which checks the source text for:

  • Spelling mistakes, including use of a personal dictionary to ignore certain words
  • Potentially insensitive or inconsiderate language (such as gendered pronouns)
  • Improper use of "a" versus "an"
  • Potentially profane language
  • Words that are are improperly repeated
  • Mistakes with quote usage

This is quite the feature set! And it goes to show just how broad the Unified ecosystem is. Most of these plugins could be added to the pipeline with just one line of code.

You can see the final result in this repo.

Epilogue

One final note: over the course of writing this, I actually decided to publish the music plugin to NPM. It's available as remark-abcjs, and it bundles a patched version of abcjs to avoid having to patch it yourself. Give it a try!

npm install remark-abcjs
]]>
https://breq.dev/2022/01/03/unified /2022/01/03/unified Mon, 03 Jan 2022 00:00:00 GMT
<![CDATA[LetMeIn]]>

A video of me unlocking my door. Don't worry, I changed my PIN afterwards.

Motivation

I wanted to learn how to use Puppeteer, since I've seen it used in different projects. Puppeteer automates a browser, and it's used both for automated testing and in various backend applications. For instance, VSinder used Puppeteer to automate screenshotting code snippets, which famously knocked Carbon offline for a while when it was DDoSed. Lore aside, I wanted to get experience with actually using it, since I figured it would likely come in handy at some point.

Technical Description

The code is just a hundred lines of NodeJS. The Puppeteer API makes heavy use of Promises, so I wrote it all using async functions. I'm using Koa to trigger the process when an HTTP request comes in--it's like Express, but based on promises instead of callbacks.
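
The actual flow is tied to Northeastern's portal, so the sketch below only shows the overall shape: a Koa middleware that launches Puppeteer on each request. The route, URL, selectors, and environment variable names are placeholders for illustration, not the real ones.

import Koa from "koa";
import puppeteer from "puppeteer";

const app = new Koa();

app.use(async (ctx) => {
  if (ctx.method !== "POST" || ctx.path !== "/unlock") {
    ctx.status = 404;
    return;
  }

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Walk through the SSO form, then trigger the portal's unlock action.
    // (The URL and selectors here are placeholders.)
    await page.goto("https://example-portal.northeastern.edu/");
    await page.type("#username", process.env.NU_USERNAME);
    await page.type("#password", process.env.NU_PASSWORD);
    await page.click("button[type=submit]");
    await page.waitForNavigation({ waitUntil: "networkidle0" });

    ctx.body = { unlocked: true };
  } finally {
    await browser.close();
  }
});

app.listen(3000);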

I had a tough time getting a consistent setup to automate. Northeastern uses its own SSO system, plus Duo for two-factor authentication. Annoyingly, these two are seemingly on different expiration timers, making reliable execution difficult.

I eventually decided to try removing all Northeastern cookies on each invocation and manually stepping through the sign-in process (typing in username and password), which worked well to consistently get through Northeastern's login step. For Duo, I tried to save cookies between invocations, but ran into unreliable behavior.

I could have configured Duo to send an SMS to a Twilio phone number and then read that into Puppeteer to enter the 2FA code, but I didn't want to spend money on this project. Thus, in the final video, I manually clicked "approve" on my phone. Sorry for the deception.

Results

I mean, it worked? It did unlock my door. That said, it's so impractical in its current state that I don't think I could salvage it into something actually useful. A couple key takeaways:

Browsers are made for humans, so they aren't deterministic. Even with Puppeteer, there are going to be some hiccups when controlling them with automation. Even simple things like waiting for a page to load can be hard--requests happen in an unpredictable order, so waiting for one specific event or request can introduce race conditions. The recommended solution is to wait a predetermined interval after the last web request closes, which covers most edge cases but is nonetheless inefficient.

Browsers are huge, and that leads to a lot of overhead. Including a full Chromium instance in a project makes node_modules massive, and even spinning up a new tab to handle a request takes an appreciable amount of time. This is a setup that could work in a parallelized testing rig, but it's really an option of last resort for any production use case.

Not everything lends itself to automation. In this case, my script needed to wait for many different Northeastern sites to load, and considering the primary web portal I used has seemingly not been touched in a decade, it loaded quite slowly. Even though the process was automated, it wasn't actually that much faster than doing the steps by hand. This surprised me. I'm used to humans slowing down machines, so I figured I could get a significant speedup by applying automation to the problem, but the human was never the bottleneck in the first place.

]]>
https://breq.dev/projects/letmein /projects/letmein Sun, 19 Dec 2021 00:00:00 GMT
<![CDATA[flowspace]]>

The login screen for the flowspace web app.

Overview

flowspace is a social network website. It has a few basic features, such as direct messaging, friend requests, and public and private posts. Notably, there are two "tiers" of friends--the "wave" tier includes anyone you acknowledge and allow to message you, and the "follow" tier puts that person's posts into your feed. The outer "wave" tier keeps a gate around private messaging, helping to reduce the potential for harassment, without requiring you to see every post from someone just to exchange messages with them.

The primary feed, containing posts of people you follow.

Motivation

Honestly, I kind of just wanted to take on a big project to learn and have fun. I definitely didn't have any hopes of actually building a userbase, considering the (aptly-named) network effect would make it pretty difficult to get anything off the ground. I figured a larger web project like this would involve a lot of interesting architectural decisions and using frameworks and services that I hadn't used much before.

Technical Description

The service uses a JAMstack architecture, separating the static client app from the dynamic REST API. The API backend is written in NodeJS.

Handling HTTP requests, middleware, routing, etc. is done with Koa and many plugins. Koa is a rewrite of the popular Express framework that uses Promises instead of callbacks--this results in much cleaner code in my opinion :)

The service uses three databases: PostgreSQL, S3 (compatible), and Redis. PostgreSQL is the primary data store, and it stores user profiles, messages, friend requests, posts, and the like. I'm using Prisma to make working with relational data easier. The S3 compatible service stores profile pictures, and it could be used to store message and post attachments as well. Initially, I used a self-hosted Minio container for this, but I decided to switch to GCP because of the generous free tier. Finally, Redis is used to handle rate-limiting the API.

The client app is a static single-page-application built with React and create-react-app. The CSS is all done in Tailwind. I've deployed it to Cloudflare Pages, again due to their generous free tier.

Authentication is handled through JWTs stored in localStorage. Upon login, a user will get two tokens: an access token and a refresh token. The refresh token will allow generating a new access token for up to seven days, letting users stay logged in for a while. Tokens are signed using both a secret key and the user's password hash, ensuring that if a user resets their password, any existing tokens will be automatically invalidated. Password resets and email verification are handled through SendGrid. (Gotta love free tiers, amirite?)
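
Here's a minimal sketch of that signing scheme, assuming the jsonwebtoken package; the claim names and the access token lifetime are illustrative, not flowspace's exact values. Because the password hash is folded into the signing key, resetting the password changes the key, so every outstanding token fails verification.

// Sketch only: claim names and the access token lifetime are illustrative.
import jwt from "jsonwebtoken";

const SECRET = process.env.TOKEN_SECRET;

// Mix the app secret with the user's stored password hash, so a password
// reset changes the signing key and invalidates existing tokens.
const keyFor = (user) => SECRET + user.passwordHash;

export function issueTokens(user) {
  const accessToken = jwt.sign({ sub: user.id, type: "access" }, keyFor(user), {
    expiresIn: "15m",
  });
  const refreshToken = jwt.sign({ sub: user.id, type: "refresh" }, keyFor(user), {
    expiresIn: "7d",
  });
  return { accessToken, refreshToken };
}

export function verifyToken(token, user) {
  // Throws if the signature (and therefore the password hash) no longer matches.
  return jwt.verify(token, keyFor(user));
}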

The messages page.

In addition to the REST API, a WebSocket "Gateway" endpoint is provided. This allows clients to subscribe to any message channel for updates, and receive any messages that come in on that channel. This is used for real-time direct messaging.
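
From the client's perspective, using the gateway looks something like the sketch below. The wss URL and the message shape are hypothetical; the real protocol is defined by the flowspace backend.

// Hypothetical gateway usage; the URL and message shape are illustrative.
const accessToken = localStorage.getItem("accessToken");
const gateway = new WebSocket("wss://api.flowspace.example/gateway");

gateway.addEventListener("open", () => {
  // Authenticate and subscribe to a DM channel for live updates.
  gateway.send(JSON.stringify({ op: "subscribe", channel: "dm:1234", token: accessToken }));
});

gateway.addEventListener("message", (event) => {
  const message = JSON.parse(event.data);
  console.log("new message on", message.channel, message.content);
});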

Results

It works! And while it's far from complete, the feature set is pretty good. I don't think I'm going to work more on it, considering it's first and foremost a learning project for myself and nobody else is using it much.

I hadn't worked with NodeJS in the backend before. Previously, my go-to was always Flask for these kinds of APIs, and I'd been tinkering with Quart as well (which is a reimplementation of Flask that uses an event loop via Python's asyncio library). Eventually, I asked myself why I was using this small fork of Flask just for the event loop feature when NodeJS was famous for its event loop and I was already familiar with JavaScript on the client side. I wasn't sure how long it would take to get up to speed with NodeJS, but I found it pretty easy to learn from a client-side background. (Having Node APIs not match Web APIs was a bit jarring, and I'd like to look into Deno at some point, but it just doesn't seem like the ecosystem is there yet.)

Working with a big relational database was also really interesting. The app has different relationship types: For instance, the mapping from users to posts is one-to-many, but the mapping of users to followers/friends is many-to-many. Learning how to express these relationship types in SQL was a surprising challenge for me, considering I hadn't really worked with data like this before.

This was also one of the first major sites I built with Tailwind. CSS is something I'm gradually starting to get better at, and I think Tailwind is helping with that. It really helped shift my perspective on CSS from "those weird layout commands" to "nuanced rules to declaratively describe the layout of any element as a function of its size, children, etc." Miriam Suzanne's video on "Why is CSS so weird?" is a really great starting point for this way of thinking, but I think it really takes time and practice to understand how to work within the CSS rules, instead of around them.

P.S. if you'd like to message me on there, make an account and browse to my profile at https://flowspace.breq.dev/u/AtACdumJQAA. I might not get your message unless you ping me elsewhere though, since I never got around to implementing push notifications :)

]]>
https://breq.dev/projects/flowspace /projects/flowspace Thu, 25 Nov 2021 00:00:00 GMT
<![CDATA[Tunneling, Routing, and NATting my way to smuggle IPv6 into Northeastern]]>

Northeastern's dorms have good WiFi... for most people. The speed is excellent, the coverage is great, and I have yet to encounter any content blocking. That said, as a tinkerer, I had some concerns.

Client Isolation

By default, the Northeastern network isolates devices from each other. On the surface, this is a pretty smart thing to do. It's great that my computer isn't directly accessible from every other machine on campus.

That said, it can be quite a headache to work around sometimes. At home, I would connect to my wall-mounted matrix display project by typing http://matrix.local/ into my browser, and mDNS would do its thing, connecting me directly to the display.

Obviously, with client isolation, that wasn't going to work.

In a previous post, I wrote about the Wireguard server that I run for myself. After setting up a Wireguard client on the matrix, I was able to access it from any of my other devices, provided I remembered to turn the VPN on.

Hey, if I have a VPN, can't I run IPv6 through it?

Northeastern's WiFi doesn't natively support IPv6. I want to have access to an IPv6 network, so that I can test the IPv6 functionality of my websites.

Nope. No IPv6 support here.

So, my next question was: could I route IPv6 through my VPN?

Getting IPv6 to Microsoft Azure

When I first set up my Dokku machine, IPv6 wasn't available for VMs. It wasn't released until April 2020. Ugh.

This ServerFault answer helped me figure it out, but basically, I needed to attach a completely new virtual NIC to the machine. Which meant a reboot... which meant downtime... which sucks, but it's kind of inevitable, since I'm hosting so many services on just one machine. (What can I say--as much as I'd love my own redundant Kubernetes cluster, that stuff's expensive to run.)

After changing the VM settings, NIC settings, VLAN settings, and VLAN subnet settings in the Azure portal, I was finally able to access the IPv6 internet from my Azure VM with its new IPv6 address.

...its single IPv6 address.

Yep, instead of assigning a /64 to my machine, Microsoft gave me just a single address.

Fine, I'll NAT it

The whole point of IPv6 was to increase the available address space, removing the need for one-to-many network address translation (NAT). With IPv4, there are only enough IP addresses to give each household or small office a single address to share between all devices. Although larger businesses can get larger blocks, they still need to share. NAT allows this sharing to take place.

IPv6 was supposed to remove this requirement entirely. Each subnet should be a /64, giving plenty of potential addresses to any connected clients. Then, end-users should get blocks of /56 so that they can create separate subnets as needed.

But, unfortunately, we don't live in that reality, so I had to resort to NAT.

Docker

Docker can be configured to support IPv6 by editing the /etc/docker/daemon.json file:

{
  "ipv6": true,
  "fixed-cidr-v6": "fd16:42d4:7eff::/80"
}

fixed-cidr-v6 is supposed to be the public IPv6 range to assign to Docker containers. However, since I don't have a range of addresses, I'm using a Unique Local Address, or ULA, range instead. The 42d4:7eff part was randomly generated to reduce the chance of collision.

Next, I needed to configure Docker to NAT this ULA range instead of trying to route it directly to the Internet. I found some code (ironically itself distributed as a Docker container) that sets up this NAT: robbertkl's docker-ipv6nat.

I'm aware NAT on IPv6 is almost always a no-go, since the huge number of available addresses removes the need for it. [...] I'm in no way "pro IPv6 NAT" in the general case; I'm just "pro working shit".

At least the repo is clear about being an ugly hack. It was easy enough to deploy to Dokku with docker-options add deploy --privileged --network host -v /var/run/docker.sock:/var/run/docker.sock:ro -v /lib/modules:/lib/modules:ro --restart=on-failure:10.

Wireguard Server

Now, I could access the IPv6 Internet from the Wireguard container itself, but I needed to expose this to the clients. I decided to just add another NAT so that I didn't interfere with how Docker keeps track of container IPs.

For this subnet, I chose a ULA prefix of fd16:4252:4551::/64, because I wanted something memorable, and 0x42 0x52 0x45 0x51 is the ASCII for "BREQ". It might not be RFC compliant, but at least it's cool.

I started by adding --sysctl=net.ipv6.conf.all.forwarding=1 to the docker options for the Wireguard container. I then added some additional iptables rules for the IPv6 traffic. Honestly, I'm no expert with iptables, so I just copied the default iptables rules but changed the command to ip6tables.

  • ip6tables -A FORWARD -i %i -j ACCEPT: forward all traffic inbound to the Wireguard interface (wg0)
  • ip6tables -A FORWARD -o %i -j ACCEPT: forward all outbound traffic from wg0
  • ip6tables -t nat -A POSTROUTING -o eth0 -j MASQUERADE: apply many-to-one NAT to traffic exiting the container

Wireguard Peers

For every peer on the network, I needed to assign an IPv6 address on the subnet. The server became fd16:4252:4551::1, my desktop was fd16:4252:4551::2, and so on. I then added AllowedIPs=::/0 to each peer so that it routed all IPv6 traffic over the VPN.
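
For reference, the relevant parts of a peer's config end up looking roughly like this. The IPv4 addresses, endpoint, and port are illustrative; only the ULA prefix is the real one.

[Interface]
# This peer's address on the fd16:4252:4551::/64 subnet.
Address = 10.0.0.2/32, fd16:4252:4551::2/128
PrivateKey = <peer private key>

[Peer]
PublicKey = <server public key>
Endpoint = <server address>:51820
# ::/0 routes all IPv6 traffic through the tunnel.
AllowedIPs = 10.0.0.0/24, ::/0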

Results

It's a pretty janky IPv6 implementation, especially since it has two layers of NAT when IPv6 wasn't meant to be used with NAT at all, but it works!

As a side effect, after adding an AAAA record to my DNS provider, all of my API services are now available over IPv6 as well. That means cards.api.breq.dev, botbuilder.api.breq.dev, etc., are all dual-stack all the way.

This was a lot of effort for, well, not a lot of importance. That said, I'm glad that I got to experience some of the issues with upgrading a complex network to IPv6. In many ways, it's a completely different mindset--under a well-thought-out IPv6 system, you want to make sure each node that might have a subnet behind it at some point has its own /64 to play with. Services like Docker and cloud providers built out their infrastructure without considering this, and in the process of patching IPv6 support in, they didn't adhere to these best practices, leading to struggles downstream.

]]>
https://breq.dev/2021/09/06/inv6 /2021/09/06/inv6 Mon, 06 Sep 2021 00:00:00 GMT
<![CDATA[PocKey - RP2040, SH1107, and lessons from a failed project]]>

This is a story about a project I made that failed. By "failed," I don't mean just something that didn't perform well or didn't meet my expectations. What I mean is, I invested countless hours into this project and I have nothing to show for it. That is, nothing but a story.

Did you come here looking for LVGL tips? Skip to here.

Looking for source code? Here's my CircuitPython work and my RP2040 SDK work on GitHub.

The Premise

For years, I had been wanting to build a musical instrument using Adafruit's NeoTrellis kit. This board, combined with a silicone keypad, creates a surface of 16 buttons with individual RGB lighting.

Picture from Adafruit demonstrating the NeoTrellis system.

This arrangement reminded me of the Novation Launchpad with its arrangement of colorful buttons. I decided to try to make a similar USB MIDI controller.

I then had to decide which microcontroller to pick. At the time, the RP2040 from Raspberry Pi was new, and I wanted to experiment with it. I eventually went with Adafruit's RP2040 Feather, and I threw in an SH1107-based display add-on as an afterthought.

I called it "PocKey," both because it was a pocket keypad/keyboard, and because I like Pocky.

Physical Construction

This, surprisingly, went pretty well. I mocked up a basic clamshell design in SketchUp...

Everything pretty much fit on the first try, except for a few issues with the Feather mounting holes:

Sorry about the horrible picture quality, I got a new phone with a better camera about a month after I took this picture...

With that out of the way, I turned to software.

CircuitPython

I hadn't done much with MicroPython before (or with CircuitPython, Adafruit's fork). I had used it a bit with the ESP8266, but hadn't had much success.

Thankfully, I had a much better experience this time around, at least at first. Adafruit publishes CircuitPython libraries for pretty much all of their boards, which really made throwing things together a lot easier. With the hardware interfacing abstracted away, I just had to focus on the application logic.

Tangent: Hot-Reload

Here is where the scope of the project started to expand a bit. I decided it would be interesting to have multiple loadable "apps" on the device, such as a macro keyboard, MIDI controller, or even some simple games. While I was building out an app loader, CircuitPython's hot reload functionality started to get in the way.

To edit code, open the code.py file on your CIRCUITPY drive into your editor. Make the desired changes to your code. Save the file. That's it! Your code changes are run as soon as the file is done saving.

While this functionality makes things easy for most projects, it started to get in my way, since changing a single app would require reloading the entire project, losing any persistent state. I decided to work on my own "hot reload" functionality.

This supporting code ended up turning into a full-on operating system, handling display updates, button interrupts, and a ton of other functionality. But it did end up making apps easier to write. Here's a basic one I wrote that provides macro keys -- two keys on the main board, and an additional one on the display:

import usb_hid
from adafruit_hid.keyboard import Keyboard
from adafruit_hid.keyboard_layout_us import KeyboardLayoutUS
from adafruit_hid.keycode import Keycode

from pockey.app import App


class KeyboardApp(App):
    def __init__(self, pockey):
        super().__init__(pockey)

        self.keyboard = Keyboard(usb_hid.devices)
        self.layout = KeyboardLayoutUS(self.keyboard)

        self.mapping = {
            0: "Hello World!",
            1: "https://google.com/\n",
            'A': "Hello hello!"
        }

    def setup(self):
        self.pockey.text.enabled = True
        self.pockey.text[0] = "Keyboard Demo"

        self.pockey.trellis[0] = (255, 255, 255)
        self.pockey.trellis[1] = (255, 255, 255)

    def handle_button(self, number, edge):
        if edge == self.pockey.PRESSED:
            if number in self.mapping:
                self.layout.write(self.mapping[number])

    def mainloop(self):
        pass

    def teardown(self):
        self.keyboard.release_all()


app = KeyboardApp

So far, things were going great! I did notice that the apps were starting to feel a bit sluggish, however...

Unreasonable Defaults: Auto-Writes and Auto-Refreshes

The CircuitPython documentation makes it clear that CircuitPython is intended primarily as an educational tool.

The easiest way to program microcontrollers: CircuitPython is a programming language designed to simplify experimenting and learning to code on low-cost microcontroller boards.

Going into this, I wasn't expecting CircuitPython to be the highest-performance library out there by any means. That said, I was still surprised by the amount of tweaking I needed to do just to get a halfway acceptable level of performance.

CircuitPython, by default, will immediately write any updates to NeoPixel strands or graphical displays. Admittedly, this makes things a bit easier to get started with--when I was just starting out with embedded software, I always forgot to call display.show() to push updates to the screen, and I would sit there and wonder why nothing was happening.

That said, this approach makes performance much worse. The reason for this is straightforward: generally, multiple things are drawn on the screen at a time, and doing a refresh for each one of these things will be a lot slower than just doing a single refresh at the end.

Fortunately, this behavior was simple enough to disable by setting trellis.pixels.auto_write = False and passing auto_refresh=False to the adafruit_displayio_sh1107.SH1107() constructor.
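
Concretely, the setup ends up looking roughly like the sketch below, assuming the standard Adafruit CircuitPython libraries; the display size and I²C address here may not match my exact hardware.

# Sketch: library names are the standard Adafruit ones; the display size
# and I2C address are assumptions.
import board
import displayio
import adafruit_displayio_sh1107
from adafruit_neotrellis.neotrellis import NeoTrellis

i2c = board.I2C()

trellis = NeoTrellis(i2c)
trellis.pixels.auto_write = False  # hold NeoPixel updates until .show()

displayio.release_displays()
display_bus = displayio.I2CDisplay(i2c, device_address=0x3C)
display = adafruit_displayio_sh1107.SH1107(
    display_bus,
    width=128,
    height=64,
    auto_refresh=False,  # hold screen updates until an explicit refresh
)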

Update diffing and other trickery

At this point, I was still having trouble with performance. I decided to try to reduce unnecessary updates further by calculating the difference between the new and old state. Here's how I did that with the NeoPixels:

def sync(self):
    self.trellis.sync()

    virtual = self.virtual
    actual = self.actual

    dirty = False

    for pixel in range(16):
        if virtual[pixel] != actual[pixel]:
            self.trellis.pixels[pixel] = virtual[pixel]
            actual[pixel] = virtual[pixel]
            dirty = True

    if dirty:
        self.trellis.pixels.show()

I did some similar processing of the display output by tracking the text that was on the screen. While this did significantly improve performance, it still wasn't up to the level I wanted.

As an aside: you might notice that, in this code snippet, I used the lines virtual = self.virtual and actual = self.actual. That was yet another attempt at optimization. The rationale came from this article by Uri Shaked, which explains that caching references from self can improve performance.

Calling it quits

At this point, the "button-press-to-MIDI-note" latency was inconsistent and higher than I would consider acceptable. I had an application that had exploded in complexity, no good way to profile it, and no clear path to better optimization other than approaches that bordered on cargo cult programming. Without a clear path forward, I decided to give up on CircuitPython.

So... what to do now? The RP2040 didn't have well-documented Arduino support at the time (this guide wasn't published until this June). I decided to try out the RP2040 SDK from the Pi Foundation. I'm not super familiar with C++ and its toolchains, but how hard could it be?

The RP2040 SDK

The RP2040 SDK comes with great libraries... for the RP2040's own peripherals, that is. At the time, the chip was so new that I couldn't find compatible libraries for the NeoTrellis or the SH1107. So, I decided to try to roll my own.

Adafruit Seesaw

The NeoTrellis is based on a protocol that Adafruit created for I/O expansion devices called Seesaw. The protocol is built on I²C, and it is based off of multiple modules, each with its own functions. For instance, the NeoPixel module (which controls the RGB lighting in each key) is module number 0x0E. Setting a given pixel value is function number 0x04. There are some precise timing requirements with the protocol, but in general, it's just about doing an I²C write to the Seesaw's address, and setting the payload to the module and function numbers followed by any additional arguments. To read data from the Seesaw, it just takes an immediate I²C read afterward.

I wrote a Seesaw driver in C++, which meant writing to NeoPixels was as easy as this:

void NeoPixel::set(uint16_t number, uint8_t r, uint8_t g, uint8_t b) {
    number *= 3;  // each pixel is 3 bytes in the buffer

    uint8_t data[] = {(uint8_t)(number >> 8), (uint8_t)(number & 0xff), g, r, b};

    seesaw.write(
        MODULE_BASE,
        BUF,
        data,
        5
    );
}

void NeoPixel::show() {
    seesaw.write(
        MODULE_BASE,
        SHOW,
        nullptr,
        0
    );
}

Reading data from the keypad was similarly just a set of Seesaw commands.

With this out of the way, I was feeling pretty optimistic. Sure, I had to write my own library implementing the protocol, but at least I got something working! This is when things started to take a turn...

SH1107

Adafruit links a datasheet with information about the display. However, it includes almost no information other than the I²C messages to send for the OLED startup procedure. It doesn't even list the available I²C commands. And, parts of it are written in Chinese with either no English translation or an incomprehensible attempt at one. I ended up basing my implementation mostly off of another datasheet I found here, as well as using Adafruit's CircuitPython SH1107 library as a reference.

The SH1107 framebuffer mapping is somewhat unconventional compared to other hobbyist displays. Most displays allocate one or more bytes per pixel, but the SH1107 only uses one bit per pixel, since it is a monochrome screen.

LVGL

Now that I understood the SH1107 mapping, I needed a library that could draw simple shapes and text onto the screen. A popular option I found was LVGL. I had been wanting to try LVGL for a long time, mostly because it's the default graphics library for PROS.

I got to work porting LVGL to the SH1107, and I immediately noticed that it was going to take some trickery to get LVGL to play nice with this weird pixel mapping. Thankfully, LVGL provides a few callbacks that can be customized:

  • flush_cb: flush the display buffer data to the display hardware
  • rounder_cb: broaden the update area if necessary to ensure it lines up with display pages
  • set_px_cb: set a specific pixel value within the display buffer

I got to work on my implementations.

rounder_cb

This one is the most straightforward. All I needed to do was ensure that the X coordinates of the update area lined up with the page boundaries (every 8 pixels).

Within a page, all the pixels in any row are stored in the same byte. Therefore, if we want to update part of the display, we need to ensure that the area we're updating is aligned to the page boundaries, since we can't update individual bits.

In this diagram, each vertical black line represents a boundary between two pages. The original update area, shown in red, is extended to line up with the page boundaries.

I started by rounding the first coordinate down to the nearest multiple of 8 by masking off the last three bits. Then, I did the same for the second coordinate, adding 7 to bring it to the end of the page.

void Display::round(lv_disp_drv_t* disp_drv, lv_area_t* area) {
    area->x1 = area->x1 & ~0x7;
    area->x2 = (area->x2 & ~0x7) + 7;
}

set_px_cb

Next, I took on the callback for setting pixel values.

I started by calculating the page, column, and bit of the given pixel. Then came the tricky part--I found the number of bytes per page. LVGL sometimes only updates part of the display at a time, so it might decide to only update half of the display.

Finally, I found the offset into the buffer, and used a mask to flip the bit.

void Display::set_pixel(lv_disp_drv_t* disp_drv, uint8_t* buf, lv_coord_t buf_w, lv_coord_t x, lv_coord_t y, lv_color_t color, lv_opa_t opa) {
    uint16_t page = x >> 3;
    uint16_t column = y;

    uint8_t bit = x & 0x7;
    uint8_t mask = 1 << bit;

    uint16_t bytes_per_page = disp_buf.area.y2 - disp_buf.area.y1 + 1;
    uint16_t buffer_index = (page * bytes_per_page) + column;

    if (color.full == 0) {
        buf[buffer_index] |= mask;
    } else {
        buf[buffer_index] &= ~mask;
    }
}

flush_cb

One more. For this one, I found the starting page, ending page, starting column, and number of bytes per page. I then iterated through each page.

For each page, I set the display page address to the current page, then set the column address to the starting column (since there are more than 16 columns, this is split into 2 commands, one for the high bits and one for the low bits). Finally, I found the index into the buffer, and sent the number of bytes per page.

void Display::flush(lv_disp_drv_t* disp_drv, const lv_area_t* area, lv_color_t* color_p) {
    uint8_t start_page = area->x1 >> 3;
    uint8_t end_page = area->x2 >> 3;

    uint8_t start_col = area->y1;
    uint8_t end_col = area->y2 + 1;

    uint8_t start_col_high = (start_col >> 4) & 0x7;
    uint8_t start_col_low = start_col & 0xF;

    uint8_t bytes_per_page = end_col - start_col;

    uint8_t* color_buffer = reinterpret_cast<uint8_t*>(color_p);

    for (uint8_t page_offset = 0; start_page + page_offset <= end_page; ++page_offset) {
        send_command(PAGE_ADDR | (start_page + page_offset));
        send_command(COL_ADDR_LOW | start_col_low);
        send_command(COL_ADDR_HIGH | start_col_high);

        uint16_t buffer_index = page_offset * bytes_per_page;
        uint8_t* data = color_buffer + buffer_index;

        send_data(data, bytes_per_page);
    }

    lv_disp_flush_ready(disp_drv);
}

I write these as if this was a straightforward process. In reality, getting these callbacks right took me about a week of trial and error. I spent so long troubleshooting edge cases that only occurred for specific update area sizes, and struggling to understand how the addressing of the SH1107 worked to begin with. But finally, I had a working display driver. Now, all I needed was the USB functionality to send keystrokes or MIDI input to the computer.

TinyUSB

The RP2040 SDK includes TinyUSB as a high-level USB library. The Pi Foundation provides no documentation for this library. The TinyUSB docs say that...

It is relatively simple to incorporate tinyusb into your (existing) project

...but they provide almost no documentation. Seriously, what is "Implement all enabled classes's [sic] callbacks"??? What classes are enabled? What classes should I enable? What callbacks do they have? What does my implementation need to include???

It is at this point where I gave up on this project.

Conclusion

Well, this is it. I'm faced with a project I spent countless hours on, without anything to show for it. So what did I learn?

Introducing abstractions for short-term speedup can lead to technical debt in the long run. By trying to optimize the CircuitPython build as much as possible, I introduced complexity that left me with an application that was harder to understand. Without clear knowledge as to what my code was doing, I was left without any clear way to improve it.

The flashiest solution isn't always the best. If I had picked a chip with stable Arduino support, I could've taken advantage of existing libraries while keeping the speed of C++. Choosing the brand new RP2040 put me on the bleeding edge.

Pick the right tool for the job. Python, on a microcontroller, for a latency-critical application... Even though CircuitPython let me get up and running quickly, it couldn't achieve what I was targeting, forcing me to rewrite everything from scratch in C++.

Fail early. One of the main dealbreakers of the project was that the keypad buttons didn't respond well to "drumming" input--they needed to be completely pressed down. Instead of recognizing this flaw in the design, I kept investing effort into the project anyway.

Focus on the MVP. In the early stages, I invested a lot of time in building out the app-specific hot reload feature. If I had only focused on the core MIDI functionality, I would have faced the latency and usability issues a lot sooner. By worrying about side features instead of the minimum viable product, I was delaying the inevitable.

Epilogue

Ironically, two weeks after I gave up on this project, Adafruit released an extremely similar design as a kit, trading the three buttons for an encoder and the silicone keypad for Cherry MX (clone) switches.

I'm not disappointed in myself for how this project went. I learned more about bit operations in embedded programming, the mechanics of I²C, and the tooling required to manage a large C++ project. I also learned some important lessons about project management and planning. I'm glad I'm experiencing failure like this now, when the only casualty is a bit of my spare time.

Oh, and when I said I had nothing to show... that isn't entirely true.

]]>
https://breq.dev/2021/08/29/pockey /2021/08/29/pockey Sun, 29 Aug 2021 00:00:00 GMT
<![CDATA[React Twitter NoTrack]]>

react-twitter-notrack does exactly what it says on the tin: allow you to embed Tweets as React components without exposing your users to tracking. You can install it from npm with npm i react-twitter-notrack.

Unfortunately, things seem to have stopped working, probably as a result of Twitter improving their bot detection or shutting down their API. There were enough tradeoffs inherent in this project that I don't intend to continue maintaining it. Thus, it is no longer usable.

import { Tweet } from "react-twitter-notrack";

function App() {
  return (
    <Tweet id="20" apiUrl="https://twitter-proxy.breq.workers.dev" />
  );
}

Motivation

I wanted to be able to embed Tweets on my website, so I looked into React-based Twitter embed libraries. The two that I found, react-twitter-widgets and react-twitter-embed, operated similarly: they both used Twitter's widget system.

Twitter has a system of widgets that can be embedded on a website. They work by using a JavaScript library to dynamically include iFrames onto the page. I'm not a huge fan of this approach, for a couple reasons:

  • The iFrames tend to load in after everything else on the page, leading to a huge layout jump.
  • Embedding iFrames uses more resources.
  • Users who use tracker-blockers, like Firefox Enhanced Tracking Protection, might not see the Tweet at all.
  • Allowing Twitter to execute JavaScript on my webpage exposes my users to tracking without their consent.

I decided to try making my own Twitter embed, built as a pure React component without any imperative DOM manipulation.

Technical Description

I started building out this project using Storybook, which I had wanted to try out for a while. I really enjoyed how quickly it allowed me to iterate.

Using an official Twitter widget as a model, I tried to emulate the design as best I could. I built out this mockup using styled-components, which was another first for me. I think that styled-components was a good fit for this project--it's a lot lighter-weight than Tailwind, and there was no need for me to stick to a broader design system like my website as a whole. That said, for more ambitious projects, I'll probably stick with Tailwind to keep things looking cohesive.

Then came the data fetching. I noticed that the official Twitter embed was sending a request to cdn.syndication.twimg.com:

...and getting back a response with info about the tweet:

So, just call this endpoint from the React component and we're good, right?

...nope. Twitter uses CORS to only allow platform.twitter.com to access this endpoint. For the official embed, this isn't an issue, since the iFrame is loaded from that origin. But for our pirate embed, we'll need to find another way.

I ended up building a proxy with Cloudflare Workers to spoof the Origin header to Twitter and send back a permissive Access-Control-Allow-Origin to the client. This is pretty much the same approach I used for GenReGen's Pastebin proxy.

With that out of the way, I just had to fetch the data from the proxy. I used trusty old useSWR to get the job done.
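
The data fetching boils down to a hook like the sketch below. The proxy's query parameter and the shape of the response are guesses for illustration; the syndication endpoint returns more fields than shown.

// Sketch: the proxy's query parameter and response fields are illustrative.
import useSWR from "swr";

const fetcher = (url) => fetch(url).then((res) => res.json());

export function useTweet(id, apiUrl) {
  const { data, error } = useSWR(`${apiUrl}?id=${id}`, fetcher);

  return {
    tweet: data,
    isLoading: !data && !error,
    isError: error,
  };
}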

Results

This library doesn't produce embeds with as much polish as the official Twitter ones -- they don't show the original tweet when embedding a reply or quote tweet, and they don't show more than one image at a time for now. But overall, I think this resulted in something usable and more performant than the official embed. For a while, I used it for all the Twitter embeds on this website, and it checked all the boxes. I don't know if anyone else ever found it useful, but I'm still happy that I shared my work on NPM.

]]>
https://breq.dev/projects/react-twitter-notrack /projects/react-twitter-notrack Fri, 27 Aug 2021 00:00:00 GMT
<![CDATA[Building My Online Presence]]>

I've always wanted to keep a personal website up and running. In part, that's because I want to make sure everything I work on has a home; hopefully, people can use my project writeups for inspiration or advice on their own creative endeavours. It's also partly because I want a place to share my thoughts. The main reason, however, is to keep a centralized directory of who I am.

For much of my life, my identity has been, for lack of a better word, unstable. This is an outlet for me to keep track of what makes me who I am. It's a place where I can post things that I'm proud of.

(Funnily enough, it was this period of questioning that led me to pick the name breq.dev. At the time, I knew I was deeply uncomfortable with my name, but I didn't yet understand why. When it came time to pick a domain name, I ended up scrolling through the AVR instruction set until I found something somewhat pronounceable, and it's kinda stuck!)

With that out of the way, here's the journey my online presence took from both a technical and personal perspective.

My First HTML

The first HTML file I ever hosted publicly was published on August 19, 2015. Here's that fateful commit:

As a 12-year-old with no money, it's no surprise I turned to GitHub Pages for hosting. And, as you can tell from the commit, I hadn't exactly figured out code indentation yet.

Over the years, this site turned into a disorganized collection of random .html files I happened to experiment with. This included Gemini and Bounce.

The Paper Feed

In around 2019, I started to take this website idea a bit more seriously. I started what I called the Paper Feed -- a single-page site where I would post some thoughts every couple weeks. Here's the only surviving screenshot (which is probably for the best):

It wasn't much -- just a basic CSS theme. But it was mine, and it was the first time I started to see my online presence as something I should curate and something that people might be interested in.

Jekyll, and the beginnings of the project log

Ouch. The lack of contrast hurts my eyes.

You can spot a few cool things in this revision though:

  • No more deadnames to remove! Yay!
  • My logo -- the cube with the half adder -- makes its first appearance.
  • "Welcome to my little patch of internet" is still my website slogan to this day.

It didn't take long for me to start throwing more interesting stuff on that site, however:

I started to really use CSS for things at this point, as you can tell. This page was built with Bootstrap as a base. I also started to make use of Jekyll's layout features to reduce duplication across pages, and the data features to define the navbar with YAML.

The site had two pages: an "about" page listing my contact info, and a "projects" page containing writeups for all of the projects. Usability wasn't great -- the project page was a giant wall of text -- but those initial descriptions grew into the writeups I currently have on my site.

Next came the wireframe logo. Projects started to go into their own pages as well. Things were really starting to take shape.

I added autoplaying videos to bring my projects front and center.

As I started to accumulate more projects, I switched to a tile-based layout to fit more on the homepage.

Spinning off jekyll-theme-breq

After this, I decided to spin off a Jekyll theme with my navbar style into a separate package. I'm still using this theme for my emoji keyboard.

Not much has changed really here -- I've switched to a light theme which is easier on the eyes, and I've added a big autoplaying video to the homepage. And, finally, I put the name "Brooke" in big letters! I gotta be honest, that felt good.

Gatsby, and false starts

At this point, however, I was starting to feel like I'd reached the limits of what Jekyll could comfortably handle. Project information was split up between each project file and the YAML data files, and splitting off my theme into its own project hadn't made things simpler like I'd hoped.

I decided to make the switch to Gatsby. It was based on React, which I had heard about but wasn't too comfortable with initially. I was, however, impressed by its claims of legendary performance.

It took me a couple tries to make the switch. Gatsby's data layer took me forever to get used to. Thinking of each resource as a "node" and understanding how each step in the pipeline would process that node was a big shift in how I thought about content and data.

On my third or so attempt to learn Gatsby, I ended up with something feature-complete and decided to make the switch. And I'm glad I did -- I'm starting to really love React, and the image optimization, data prefetching, and simple hot reload that Gatsby brings to the table have made my final site much easier to develop and interact with.

I also switched to Tailwind for CSS around this time, and found it to be the perfect styling system for working with React components. Having CSS in a separate file, even with CSS modules, was something I always tried to find a way around. Tailwind, on the other hand, helps me stay organized.

Where we are now

I think I've settled on a model that works, both in terms of my technology stack and in terms of the role this website serves in my life.

Preserving my work is important to me -- even if nobody else ever reads it, I want to have this here as a resource for myself, about myself, to remind me of who I am.

]]>
https://breq.dev/2021/08/26/website /2021/08/26/website Thu, 26 Aug 2021 00:00:00 GMT
<![CDATA[GenReGen]]>

Overview

"Genre Gen"-erator? "Gen"-erate "Regen"-erate? Regardless of what the name means, this is a program that'll take in a list of items and spit out randomized mashups.

Motivation

I received an email from my uncle Mitch asking for a random genre mashup generator to help with coming up with ideas for movies. I wanted to make something stable that would continue working even without my involvement.

Technical Description

The brief was simple enough -- just pull two random items from a list. I didn't even use a framework for this; it's all done with imperative DOM manipulation. Since it's just a few static files, I'm hosting it on Cloudflare Pages. Implementing a few other features, like going backwards through the list of generated mashups, was also pretty trivial.

What complicated things a bit was Mitch's request that the list be editable. I didn't want to manage an authentication system to restrict access to a central list, since that would require hosting and a custom backend solution (which is more expensive to run, more difficult to implement, and more work to maintain).

The hack that I settled on was using Pastebin as a source. Anyone can upload a list of things to Pastebin, then put the URL of that list in the "source list" box to generate random pairings from that list. The JavaScript will detect changes to this field and pull the list from Pastebin, using it for all subsequent mashups. It'll even save the most recent URL in the browser localStorage, so that the user doesn't have to keep re-pasting it.
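
Client-side, that amounts to a handful of lines, roughly like the sketch below; the element ID and the Worker URL are placeholders.

// Rough sketch; the element ID and Worker URL are placeholders.
const sourceInput = document.querySelector("#source-list");
let items = [];

async function loadList(pasteUrl) {
  const pasteId = pasteUrl.split("/").pop();
  const response = await fetch(`https://genregen-proxy.example.workers.dev/?paste=${pasteId}`);
  const text = await response.text();

  items = text.split("\n").map((line) => line.trim()).filter(Boolean);
  localStorage.setItem("sourceList", pasteUrl); // remember the list for next time
}

sourceInput.addEventListener("change", (event) => loadList(event.target.value));

// Restore the most recently used list on page load.
const saved = localStorage.getItem("sourceList");
if (saved) {
  sourceInput.value = saved;
  loadList(saved);
}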

Of course, there was another hurdle: CORS. Yes, it helps keep us safe on the web, yes, it protects private data and intellectual property rights... but in this case, it was just another issue to work around.

Oh, but Pastebin adds the Access-Control-Allow-Origin header, right...?

Ah, capitalism strikes again.

To solve this, I ended up using Cloudflare Workers to make a simple proxy that adds CORS headers on top of the Pastebin API.

addEventListener("fetch", (event) => {
  event.respondWith(
    handleRequest(event.request).catch(
      (err) => new Response(err.stack, { status: 500 })
    )
  );
});

async function handleRequest(request) {
  const incomingOrigin = request.headers.get("Origin");

  if (!/(breq\.dev|genregen\.pages\.dev)/.test(incomingOrigin)) {
    return new Response("Origin Not Allowed", { status: 403 });
  }

  const url = new URL(request.url);
  const paste = url.searchParams.get("paste");

  request = new Request("https://pastebin.com/raw/" + paste, request);
  request.headers.set("Origin", "pastebin.com");

  let response = await fetch(request);
  response = new Response(response.body, response);

  response.headers.set("Access-Control-Allow-Origin", incomingOrigin);
  response.headers.set("Vary", "Origin");

  return response;
}

Results

Honestly, it does everything I had hoped it would. The Pastebin solution is a bit janky, but in the end, I'm glad I got everything working without any recurring costs or complex backends to maintain.

]]>
https://breq.dev/projects/genregen /projects/genregen Thu, 26 Aug 2021 00:00:00 GMT
<![CDATA[Wall Matrix]]>

A video of me doing the final assembly of the display.

Overview

This is a sign that I built that hangs on the wall and shows information from the Internet.

Motivation

I bought an LED matrix panel from Adafruit years ago, and I've tried to use it to display things in the past, but I had never figured out a good way to mount it. Around the time I built this project, I had just gotten some threaded inserts for use with 3D printing, and I figured making a case for this display would be a good project to use them for.

Technical Description

Hardware

My 3D printer can only do up to 120mm in each dimension, so I needed to split the print up into two parts. I used threaded inserts to join these together, and I did the split slightly off-center so that the mounting hole would be stronger.

To attach the Raspberry Pi, I had hoped to screw it into the threaded inserts as well, but the RPi's mounting holes are M2.5 while the inserts I got are M3. I ended up making pegs that fit into the Pi's mounting holes, and a "seat belt" to hold the Pi against them. This solution was surprisingly sturdy.

Finally, it came time to connect the display to the Pi. While Adafruit sells a HAT to make the connections easy, it doesn't use the default mapping for the library I had planned on using. I decided to build my own on a Perma-Proto, and to leave room in case I wanted to add other matrix strands or devices in the future.

Software

The matrix is driven using hzeller's rpi-rgb-led-matrix library. Specifically, I'm making use of the Python bindings.

The code is split into four major sections:

The Driver

The driver exposes an interface allowing other components in the stack to send an image to the matrix. I wrote two drivers: a "real" driver that shows the image on the matrix, and a "fake" driver that draws the image in a TkInter window. This allows me to test out new sources, images, and designs on my own computer before deploying them to the Pi.

The Server

The server is written in Flask, but I've turned off the threading abilities so that the driver can remain a singleton. (Normally, this is a bad idea, but I've decided that it's fine, since this is an embedded device anyway.) It loads in any available sources, retrieves an image from the current source, and handles incoming requests to change the active source or interrupt the source with a scrolling message.

The server and driver communicate over a message queue. The server will send a SOURCE_CHANGED message when the user chooses a different source, and it will send a FLASH_MESSAGE message when the user submits a message to show.
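
In code, that pattern is roughly the sketch below. The message names are the ones described above; the driver and source interfaces are simplified stand-ins.

# Simplified sketch of the server -> driver queue; the driver and source
# methods shown here are stand-ins, not the project's exact interfaces.
import queue

SOURCE_CHANGED = "SOURCE_CHANGED"
FLASH_MESSAGE = "FLASH_MESSAGE"

messages = queue.Queue()

def driver_loop(driver, sources):
    current = sources["Weather"]
    while True:
        try:
            kind, payload = messages.get(timeout=1)
            if kind == SOURCE_CHANGED:
                current = sources[payload]
            elif kind == FLASH_MESSAGE:
                driver.scroll_message(payload)
        except queue.Empty:
            pass
        driver.show(current.get_image())

# The Flask views just push messages onto the queue, for example:
#   messages.put((SOURCE_CHANGED, "MBTA"))
#   messages.put((FLASH_MESSAGE, "hello from the web client"))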

The Sources

I've written four basic sources so far.

ColorBars: Shows the basic color bar test image on the screen.

Weather: Shows the current time, temperature, and weather on the screen, using the OpenWeatherMap API.

Crypto: Shows the current price and 24-hour percent change of a cryptocurrency (defaults to ETH), using the CoinMarketCap API.

MBTA: Shows the next northbound Green Line and Orange Line trains passing through the Northeastern University campus, using the MBTA's official API.

These sources inherit from a base Source class, which has some data caching logic built in. By default, APIs are only called once every 60 seconds, but the Crypto source caches data for 5 minutes due to the more restrictive API license.
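
The caching in the base class boils down to something like this sketch; the method names are illustrative, not the project's exact ones.

# Sketch of the base-class caching; method names are illustrative.
import time

class Source:
    cache_seconds = 60  # by default, call the API at most once a minute

    def __init__(self):
        self._cached_data = None
        self._fetched_at = 0.0

    def fetch(self):
        """Overridden by each source to actually call its API."""
        raise NotImplementedError

    def get_data(self):
        if time.monotonic() - self._fetched_at > self.cache_seconds:
            self._cached_data = self.fetch()
            self._fetched_at = time.monotonic()
        return self._cached_data

class Crypto(Source):
    cache_seconds = 300  # the CoinMarketCap license calls for a longer interval

    def fetch(self):
        ...  # call the CoinMarketCap API here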

The Client

The client is written in static HTML/CSS/JS. It features a dropdown menu to change the source and a textbox to input messages to flash. I designed it to be usable on a phone, since I figured that's how most people would like to control the sign.

Results

The finished result looks a lot more polished than I was expecting! I had a lot of bugs to work out at first -- it turns out there are a lot more edge cases than I thought there would be when it comes to displaying time -- but it's been quite reliable.

]]>
https://breq.dev/projects/wallmatrix /projects/wallmatrix Thu, 26 Aug 2021 00:00:00 GMT
<![CDATA[Bounce Homepage]]>
View the live demo!

Bounce was a homepage I made to make my school laptop just a bit more unique from everyone else's. It features three bouncing circles. Pressing space adds circles to the screen, while pressing c takes them away.

I wanted to have fun and make something unique, and in the end, I was really proud of the result! It was cool to take an open-ended idea like "make a more interesting web browser home page" and brainstorm different implementations.

]]>
https://breq.dev/projects/bounce /projects/bounce Thu, 26 Aug 2021 00:00:00 GMT
<![CDATA[Gemini]]>
View the live demo!

Overview

Gemini was a simple canvas game I built. The player had to guide the black circle towards the white circle using the arrow keys, while the black circle was pushed around by a randomly changing current.

Motivation

I wanted to make a simple game in order to learn JavaScript. I worked on this during a summer camp at MSSM, in an introductory programming course.

Technical Description

Gemini was certainly a product of 2016. It was built with jQuery. The two balls are represented as JavaScript objects, but they're created using the prototype interface, as ES2015 classes weren't widely supported or documented at the time.

Results

It was something I could show my friends, and it was something that helped me learn JavaScript, so I'd call it a success! That said, I'm glad my code quality has improved since I wrote this... :)

]]>
https://breq.dev/projects/gemini /projects/gemini Thu, 26 Aug 2021 00:00:00 GMT
<![CDATA[Motion Sickness Fish]]>

Overview

Exactly what it says on the tin. It's a Billy Bass fish that's had its electronics replaced and had Motion Sickness loaded on.

For those of you lucky enough to not know what the Billy Bass fish is, let me enlighten you:

And here's the song I loaded on:

Motivation

My uncle and aunt asked me to take this one on. The entire project needed to be finished in about a week. I had finals for the first half of the week, and I knew it would take a while to ship to my relatives in California, but I concluded that I could barely make it work if I ordered the parts right away, rebuilt the fish as quickly as possible once they arrived, and shipped the result as soon as possible.

Technical Description

Hardware and Construction

The chip that controls Billy Bass and plays the default music is covered in epoxy. Much smarter people than me struggle to do this kind of thing. I quickly ruled this option out.

What about using the existing speakers though? Well, at the time, I didn't have a physical Billy Bass handy (remember, shipping...) so I couldn't be sure of the specifications of these speakers.

The easiest solution would be to completely replace the built-in electronics with my own. Knowing that I had only a few days to get this working, and no opportunity to re-order parts if something went wrong, I knew that I needed to get this right.

Normally, I would DIY as much as possible, with hand-soldered perfboards, hectic wiring, and all sorts of bodged-together boards. It became obvious to me that this was the wrong approach here. This was my first lesson: know when too much DIY just won't cut it.

I needed a cohesive ecosystem. Adafruit has been my go-to source for parts over the past couple years, and since they're based in NYC, I knew shipping wouldn't take too long. But that still left me with plenty of options.

I started by identifying what physical hardware I'd need. Adafruit has a line of boards based off of the VS1053 chip--that takes care of my audio playing. And there are plenty of motor boards available in any form factor I would want. So what do I choose?

Adafruit's board selection generally varies by size and power supply. The VS1053 modules were available in 3 different form factors: a simple breakout board, a FeatherWing board, and an Arduino Shield board.

The breakout board would require wiring it up to a board myself, which eliminated it from the running. Feather boards and Arduino boards are available with any sort of chip, so that's not a concern. What differentiates the boards is physical size and power supply. Feathers are small, and can be powered off of a 5 volt source or a LiPo battery. Arduinos are much larger, but they can take anything from 6 to 20 volts.

Here I faced a dilemma. I wanted the electronics to be small, so that they could fit inside the Billy Bass enclosure without any sort of external box. I also was wary of running the chip and the motors off the same power rail, due to reset issues I'd experienced in the past. The Arduino, with its built-in power regulator, seemed like the better choice functionally, as it could take a standard barrel plug power supply, run the motors directly off of it, and run the microcontroller chip off a clean, regulated 5V.

However, I wasn't sure if the larger Arduino form factor would fit. I considered rigging something up with a custom perfboard to attach a voltage regulator to the Feather... but I eventually decided against it. I needed to play by the rules of the ecosystem so that I could take advantage of its ease-of-use. I bought the VS1053 and the motor driver for the Arduino.

Here's the finished product! You can see the Adafruit Metro (Arduino clone), the speakers, and not much else. Because the VS1053 chip and motor driver were mounted directly on top of the Metro, there was no need for messy mounting solutions or manual wiring.

Movement Sync Software

And now, the race was on to get something functional as fast as possible. Getting the song to play over the audio shield and speakers was easy, thanks to the well-documented Adafruit libraries. That left only the fish motion to take care of.

Getting something working this quickly would have been impossible if I had tried to roll my own amplifier circuit, or wire together my own power supply circuit, or do any of that. Here was the ecosystem working for me, letting me go from idea to working prototype in record time.

The next step was to actually move the fish. I quickly wrote functions to control the fish's head, tail, and mouth, again using the excellent library Adafruit provided for their motor shield. Normally, I would've hacked something together with an L293D in order to save a buck or two. But in this case, the motor shield was worth every penny.

Now, the question is, how do I decide when to activate these movements?

Onboard Audio Processing?

The official Amazon Alexa enabled Billy Bass reacts to the audio to decide when to move. This... doesn't exactly work well, as you can see in this Linus Tech Tips review at around the 5:03 mark:

"Okay, so he just kind of spasms then. So basically, you just have like a fish out of water on your wall whenever music is playing."

Yeah... that wasn't gonna cut it.

I decided I needed to manually choreograph the movements to match the music. No reacting-to-audio trickery would save me from this. I did attempt a few basic tests of reacting to audio loudness, but my results were even worse than the fish in that video.

Pre-Recording Movements

I realized that I would need to write some custom software if I wanted to precisely record all of these movements in relation to the song. I needed a framework that would let me play audio, handle raw keyboard events, and write in a language I could iterate quickly in. PyGame fit the bill, and I was quickly off to the races.

My script waited for the user to press the space key. It then started playing the music and keeping an internal timer. Then, whenever another key was pressed, it would log the event and the time at which it happened. Finally, it saved it to a JSON file. Pretty basic stuff.
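
To give a sense of what that looks like, here's a minimal sketch of such a recorder in PyGame. This is not the original script--the key bindings, window size, and file names are made up for illustration:

import json
import time
import pygame

pygame.init()
pygame.display.set_mode((320, 240))   # a window is required to receive keyboard events
pygame.mixer.music.load("song.mp3")   # hypothetical audio file

KEYMAP = {pygame.K_h: "head", pygame.K_t: "tail", pygame.K_r: "rest",
          pygame.K_o: "mouthOpen", pygame.K_c: "mouthClosed"}

events = {}   # maps "seconds since the previous event" -> action name
last = None
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            if event.key == pygame.K_SPACE and last is None:
                pygame.mixer.music.play()
                last = time.time()
            elif last is not None and event.key in KEYMAP:
                now = time.time()
                events[str(now - last)] = KEYMAP[event.key]
                last = now

with open("recorded.json", "w") as file:
    json.dump(events, file)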

Did I have to decouple the recording from the code generation? Probably not. But I figured it would be best to have a full record of all the movements over the course of the song, so I could try different methods of storing that on the fish if I needed to.

Storing the routine

If I had the luxury of time, I might've gone with a packed binary representation like I did for my STMusic project. However, writing the code to generate, store, and load this packed representation would be a lot of work.

I eventually landed on the idea of using a Python script to take the JSON object and generate C code that called each movement function.

import json

with open("recorded.json") as file:
    actions = json.load(file)

code = []

action_lines = {
    "head": "    fish.head();",
    "tail": "    fish.tail();",
    "rest": "    fish.rest();",
    "mouthOpen": "    fish.setMouth(1);",
    "mouthClosed": "    fish.setMouth(0);"
}

total_ts = 0
for timestamp, action in actions.items():
    total_ts += float(timestamp)
    millis = int(float(total_ts) * 1000)
    code.append(f"    delayUntil({millis});")
    code.append(action_lines[action])

generated = "\n".join(code)

with open("routine.ino", "w") as file:
    file.write("void routine() {\n")
    file.write(generated)
    file.write("\n}\n")

Give me a second, I have to wash the blood off my hands after murdering every best practice in the book with this move.

To be clear--this method is ugly. It produces an excessively large binary, it stores data in a file that people expect to be for application logic, it adds an unintuitive step to the build process, it's just generally Not The Way Things Are Supposed To Be Done.

But it worked. It kept the movements synched up to the song. At the end of the day, the future maintainability or extensibility of this program isn't something that mattered that much, and it was a worthwhile tradeoff in order to save some valuable time.

This was the second major lesson. Code quality is good, and good habits exist for a reason, but it's important to decide when things are "good enough" for that specific project.

One Last Feature: the volume knob

After putting everything together, I noticed that the fish was perhaps too loud to have in a room. At the last minute, I decided to try adding a volume knob. But how to do it?

The "knob" part was easy enough: I just grabbed some spare parts from my dad's work on repairing guitar amps. On the other hand, the "volume" part presented more of a challenge. Should I go for an analog approach, and try to create a circuit involving the knob that limited the output to the speaker? Or...

I had another idea. What if I used an analog input pin on the Arduino to read the value of the knob, then sent that value to the VS1053 to adjust its output volume in software? This greatly simplified the wiring, so I went ahead and built it.

Here, I made another tradeoff. I sampled this knob after each movement--not at a regular interval, not at a reasonable sample rate, literally just after each fish movement. And it was good enough! The variable amplitude of the song made the jagged nature of this sampling less obvious.

Results

Compromises were definitely made. The lack of built-in batteries made the project significantly more cumbersome, for one. And the motions weren't completely perfect. That said, I'm pretty proud of making what I did in such a limited turnaround time.

This project taught me when good enough was good enough. I sometimes struggle with perfectionism in my projects, where I'll rewrite the same code over and over again in pursuit of a better way of doing things. This forced me to make the tradeoffs I was too scared to make, and to decide what mattered for the final product and what didn't.

]]>
https://breq.dev/projects/fish /projects/fish Tue, 08 Jun 2021 00:00:00 GMT
<![CDATA[5F3759DF: An explanation of the world's most infamous magic number]]>

Here's a famous function in the Quake III source code. Can you guess what it does?

float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}

This function calculates the inverse of the square root of a number.

Q\_rsqrt(number) = \frac{1}{\sqrt{number}}

Preliminary Explanation

Why?

Quake III needs to perform a lot of calculations to simulate lighting. Each face of a polygon in the scene is represented by a 3D vector pointing perpendicular to it. Then, these vectors are used to calculate the reflections of light off of the face.

Each face could be represented by a vector of any length--the important information is the direction of the vector, not its length. However, the calculations become much simpler if all of the vectors have length 1. Thus, we need to normalize these vectors. We can do this by dividing the vector by its length (its norm, \| \boldsymbol{v} \|). Let \boldsymbol{v} denote the vector before normalization and \boldsymbol{\hat{v}} denote the normalized vector.

\boldsymbol{\hat{v}} = \frac{\boldsymbol{v}}{\| \boldsymbol{v} \|}

Using the Pythagorean theorem:

\| \boldsymbol{v} \| = \sqrt{v_x^2 + v_y^2 + v_z^2}
\boldsymbol{\hat{v}} = \boldsymbol{v} \cdot \frac{1}{\sqrt{v_x^2 + v_y^2 + v_z^2}}

We can then break this down into each component (x, y, and z) of the 3-dimensional vector:

\hat{v_x} = v_x \cdot \frac{1}{\sqrt{v_x^2 + v_y^2 + v_z^2}}
\hat{v_y} = v_y \cdot \frac{1}{\sqrt{v_x^2 + v_y^2 + v_z^2}}
\hat{v_z} = v_z \cdot \frac{1}{\sqrt{v_x^2 + v_y^2 + v_z^2}}

The addition and multiplication are easy enough to execute, but both the division and the square root would be very slow on a Quake III-era CPU. Thus, we need a faster method to return an approximate value of \frac{1}{\sqrt{x}}.

IEEE 754: How Computers Store Fractions

It's pretty straightforward to store an integer in binary by just converting it into base 2:

5_{10} = 0000 0101_2

But how do computers store fractions? One simple approach would be to put a decimal point at a fixed position in the number:

5_{10} = 0101.0000_2
5.25_{10} = 0101.0100_2

This seems to work well, but we've drastically reduced the size of the numbers we can store. We only have half as many bits to store our integer part, which limits us to a relatively small range of numbers, and we only have half as many bits for our fractional part, which doesn't give us a ton of precision.

Instead, we can borrow the idea of scientific notation, which represents numbers with a mantissa and exponent like 5.43 \cdot 10^3. With this, we can store very large values and very small values with a consistently low relative error. This is called "floating point"--unlike the prior idea, the decimal point can effectively "float around" in the bit representation to give us precision at both very small and very large values. Thus, if x is the number we want to store, we write it in terms of exponent E and mantissa M:

x = M \cdot 2^E

At the time Quake III was released, most CPUs were 32-bit. Thus, the floating point representation used 32 bits to store a number. They were allocated as follows:

sign (1 bit) | exponent (8 bits) | mantissa (23 bits)

The "sign" bit is used to represent whether the number was positive or negative. In this case, since we know the argument to the square root function should always be positive, we can assume it to always be zero. There are also some special cases that happen for special values like NaN, inf, or very small numbers, which we can also ignore for now.

Note that the exponent could be positive or negative. To accommodate this, it is stored with an offset of 127. For example, to store 5_{10} = 1.01 \cdot 2^2, we would take the exponent 2_{10} = 10_2 and add 127_{10} = 0111 1111_2 to it to get 1000 0001_2.

In scientific notation, the first digit of the mantissa must be between 1 and 9. In this "binary scientific notation", the first digit must be between 1 and 1. (You can see this in our example: 5_{10} = 1.01 \cdot 2^2.) Therefore, we don't actually have to store it.

Interpreting floats as ints

An integer is just represented as a base 2 number. So, what happens if we take a number, find the bits of its floating-point representation, and then interpret those bits as a base 2 integer number?

Recall that our float is stored as:

sign (1 bit) | exponent representation (8 bits) | mantissa representation (23 bits)

(Note that the least significant bit is on the right.)

Observe that the mantissa bits start in the ones place. However, remember that the mantissa is a fraction, so we've really stored (M - 1) \cdot 2^{23}.

The exponent bits, E + 127, start in the 2^{23} place.

Essentially, we have the sum of the quantity (M - 1) \cdot 2^{23} and the quantity (E + 127) \cdot 2^{23}.

Let's define the function I(x) that represents this bizarre operation of taking a floating-point representation and interpreting it as an integer:

I(x) = 2^{23} \cdot (E + 127 + M - 1)
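
In Python, this reinterpretation can be sketched with the struct module. This is just an illustration of I(x) and its inverse, not code from Quake III:

import struct

def I(x):
    # Reinterpret the bits of a 32-bit float as an unsigned integer.
    return struct.unpack(">I", struct.pack(">f", x))[0]

def I_inv(n):
    # The reverse: reinterpret a 32-bit integer's bits as a float.
    return struct.unpack(">f", struct.pack(">I", n))[0]

print(hex(I(1.0)))        # 0x3f800000: exponent field 127, mantissa 1.0
print(I_inv(0x40000000))  # 2.0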

An observation about logarithms

Let's go back to our floating-point representation of x:

x = M \cdot 2^E

Now, what happens if we take the logarithm? Let's take the log base 2, since we're working with binary. Evaluating \log_2(x) directly would be pretty slow, but let's proceed symbolically to see if we get anywhere useful.

\log_2(x) = \log_2(M \cdot 2^E)
\log_2(x) = \log_2(M) + \log_2(2^E)
\log_2(x) = \log_2(M) + E

Calculating \log_2(M) would be hard. Instead, we can make do with an approximation. Remember that since M is the mantissa, it will be greater than 1 but less than 2.

There's a convenient approximation we can use. Here's a graph in Desmos:

The red curve is \log_2(x), the function we want to approximate. The purple curve is the line x - 1, which is already a pretty good approximation. However, the blue curve is even better: x - 1 + \sigma, where \sigma is a constant tuned for maximum accuracy; \sigma = 0.0450466 works well.

Let's continue, using \log_2(x) \approx x - 1 + \sigma:

\log_2(x) = \log_2(M) + E
\log_2(x) \approx M - 1 + \sigma + E

But what does this have to do with I(x)?

Recall that

I(x) = 2^{23} \cdot (E + 127 + M - 1)

We proceed:

I(x) = 2^{23} \cdot (M - 1 + \sigma + E + 127 - \sigma)

Observe the (M - 1 + \sigma + E) part. Earlier, we found that \log_2(x) \approx M - 1 + \sigma + E. Therefore, we can substitute this in:

I(x) \approx 2^{23} \cdot (\log_2(x) + 127 - \sigma)

We can solve for \log_2(x) as follows:

\frac{I(x)}{2^{23}} \approx \log_2(x) + 127 - \sigma
\frac{I(x)}{2^{23}} - 127 + \sigma \approx \log_2(x)
\log_2(x) \approx \frac{I(x)}{2^{23}} - 127 + \sigma

And there it is: we have a much faster way to compute the logarithm.
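
As a quick sanity check, here's the same approximation in Python, reusing the struct trick from the sketch above. Again, this is an illustration, not part of the original code:

import math
import struct

SIGMA = 0.0450466

def I(x):
    # Reinterpret the bits of a 32-bit float as an unsigned integer.
    return struct.unpack(">I", struct.pack(">f", x))[0]

def fast_log2(x):
    # log2(x) ~= I(x) / 2^23 - 127 + sigma
    return I(x) / 2**23 - 127 + SIGMA

print(fast_log2(10.0), math.log2(10.0))  # ~3.295 vs ~3.322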

Rewrite the problem

Remember, we're trying to find the quantity \frac{1}{\sqrt{x}}.

We can use the properties of logarithms to find:

\log_2{\frac{1}{\sqrt{x}}} = \log_2{x^{-\frac{1}{2}}} = -\frac{1}{2} \log_2{x}

Now, let's try substituting in our fast logarithm:

\log_2{\frac{1}{\sqrt{x}}} = -\frac{1}{2} \log_2{x}
\frac{I(\frac{1}{\sqrt{x}})}{2^{23}} - 127 + \sigma \approx -\frac{1}{2} \left[ \frac{I(x)}{2^{23}} - 127 + \sigma \right]

We can do some manipulation to solve for our result:

\frac{I(\frac{1}{\sqrt{x}})}{2^{23}} - 127 + \sigma \approx -\frac{1}{2} \left[ \frac{I(x)}{2^{23}} - 127 + \sigma \right]
\frac{I(\frac{1}{\sqrt{x}})}{2^{23}} - (127 - \sigma) \approx -\frac{1}{2} \cdot \frac{I(x)}{2^{23}} + \frac{1}{2} (127 - \sigma)
I(\frac{1}{\sqrt{x}}) - 2^{23} (127 - \sigma) \approx -\frac{1}{2} I(x) + \frac{1}{2} (2^{23} (127 - \sigma))
I(\frac{1}{\sqrt{x}}) \approx -\frac{1}{2} I(x) + \frac{3}{2} (2^{23} (127 - \sigma))
I(\frac{1}{\sqrt{x}}) \approx \frac{3}{2} (2^{23} (127 - \sigma)) - \frac{1}{2} I(x)
\frac{1}{\sqrt{x}} \approx I^{-1}(\frac{3}{2} (2^{23} (127 - \sigma)) - \frac{1}{2} I(x))

The magic number

Let's look at this term:

\frac{3}{2} (2^{23} (127 - \sigma))

We know everything here ahead of time. Why don't we go through and calculate it?

\sigma = 0.0450466
\frac{3}{2} (2^{23} (127 - 0.0450466)) = 1.5974630065963008 \cdot 10^9
1.5974630065963008 \cdot 10^9 \approx 1597463007
1597463007_{10} = 5F3759DF_{16}

So there's where that's from.
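
You can check this yourself with one line of Python:

print(hex(round(1.5 * 2**23 * (127 - 0.0450466))))  # 0x5f3759df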

Anyway, we now have:

\frac{1}{\sqrt{x}} \approx I^{-1}(5F3759DF_{16} - \frac{1}{2} I(x))

This is the gist of how the function works. Let's step through the code now.

The Code

Let's go through this code, line by line, to see how it matches up with our mathematical approximation.

float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}

A quick note: the argument to the function is called number, but I'll be calling it x for simplicity's sake.

Evil Floating Point Bit Level Hacking

Let's look at the first part of the function.

float Q_rsqrt( float number )
{
long i;
float y;
...
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
...
}

We start by declaring a long, which is a 32-bit integer, called i. Then, we declare a float, or a floating-point representation number, y. We store the value of the argument (number, or xx) into y. Simple enough.

The next line, however, is where things get ugly. Starting from the right, let's go step-by-step.

y is, of course, our floating-point number.

&y refers to the reference to y--the location in computer memory at which y is stored. &y is a pointer to a floating-point number.

( long * ) is a cast--it converts a value from one type to another. Here, we're converting &y from "pointer to a floating-point number" to "pointer to a 32-bit integer". This doesn't modify the bits in y at all, it modifies the type of the pointer. It tells the compiler that the value at this pointer isn't a float, it's an int.

i = * [...] will dereference the pointer. It sets i equal to the value at that pointer. Since the pointer is considered a pointer to a 32-bit integer, and i is declared as a 32-bit integer, this just sets i equal to the bits at that location in memory.

Effectively, this part takes the bits in the floating-point representation of the argument (number) and interprets them as a 32-bit integer instead.

Does that sound familiar? It's our I(x) function we defined earlier! These lines could be written as

y \gets x
i \gets I(y)

Or, more concisely:

i \gets I(x)

What's with all this memory trickery?

You might ask, why can't we just do this?

i = (long) y;

After all, we just want y as an integer, right?

However, this expression will actually convert the value that y represents into an integer. For instance, if y = 5.25_{10} = 1.0101 \cdot 2^2, then this code will set i = 5. It will convert y to an integer.

This isn't what we want. We don't want to do any conversion -- we are literally taking the bit representation of y and interpreting it as an integer instead. Thus, we convert the pointer instead, which doesn't actually modify the bits stored in memory.

You might think that the code looks ugly. That's because it is--reading the bits of a float through a pointer of a different type like this violates C's strict aliasing rules, making it undefined behavior: the C standard does not guarantee what will happen. We're basically tricking the compiler here. This is definitely considered bad practice, which is why it's "Evil Floating Point Bit Level Hacking". Evil because it's relying on undefined behavior, Floating Point because we're working with a floating-point representation, Bit Level because we're directly using the bit representation as an integer, and Hacking because this is most definitely not the way casting is intended to be used.

The "WTF" Line

float Q_rsqrt( float number )
{
...
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
...
}

If you aren't familiar with bitwise operations, the symbol ( i >> 1 ) might seem strange to you.

Think about doing a division problem like 13560 \div 3. It would be pretty tough, right? You'd probably need to write out the entire long division problem.

On the other hand, think about finding 13560 \div 10. You can pretty quickly tell that the result is 1356, right? All you had to do was shift all the digits one place to the right.

We can use the same trick in binary. To divide a number by 2, we just need to shift each bit to the right. That is what >> 1 does--it's just a much faster way to divide by two.

Remember, in the last step, we set i \gets I(x). Thus, ( i >> 1 ) is \frac{1}{2} I(x).

We then subtract this from our magic number:

i \gets 5F3759DF_{16} - \frac{1}{2} I(x)

Finishing up

float Q_rsqrt( float number )
{
...
y = * ( float * ) &i;
...
}

This looks very similar to the i = * ( long * ) &y; line we looked at earlier, and that's because it is. However, instead of interpreting the floating-point representation y as an integer, we're interpreting the integer i as a floating-point representation. You can think of this as our I^{-1}(x) function.

This step performs y \gets I^{-1}(i). Since the last step set i \gets 5F3759DF_{16} - \frac{1}{2} I(x), this effectively sets:

y \gets I^{-1}(5F3759DF_{16} - \frac{1}{2} I(x))

There we go!

But wait, there's more: Newton's method

float Q_rsqrt( float number )
{
...
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
...
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
...
}

What does this do?

This line performs "Newton's method", which is a method of refining an approximation for the root of a function.

Let's define f(t) = \frac{1}{t^2} - x. Notice that when t = \frac{1}{\sqrt{x}}, f(t) = 0. Therefore, we can use a root-finding algorithm to try to find this root of f(t), and we'll get back a better approximation for t = \frac{1}{\sqrt{x}}.

Here's a graph that shows how Newton's method works:

Note: since we're working with both arguments to Q\_rsqrt(x) and arguments to f(t), I've decided to stick with using t for the latter. Generally, when working with Newton's method, this value would be called x.

We have our function in red, and an initial guess in dotted black, and we're trying to find the point at which the function crosses the t-axis. We can draw a line tangent to the function at our initial guess (in green) and then find the intersection of that line with the t-axis to get an even better guess. We can keep doing this until we're happy with the precision.

So, we have our initial guess given by our I^{-1}(5F3759DF_{16} - \frac{1}{2} I(x)) expression. Let's call this guess t_0. How do we draw a tangent line?

Remember, the point-slope form for a line is given by y = m(t - t_0) + y_0. Thus, we can just plug in our initial point (t_0, f(t_0)) to get y = m(t - t_0) + f(t_0).

To get the slope m, we need to take the derivative of the function f(t). Let's do this later--just call it f'(t) for now.

y = f'(t_0) \cdot (t - t_0) + f(t_0)

We're trying to find the point where this tangent line crosses the t-axis, that is, where y = 0. Substitute 0 for y:

0 = f'(t_0) \cdot (t - t_0) + f(t_0)
f(t_0) = - f'(t_0) \cdot (t - t_0)
-\frac{f(t_0)}{f'(t_0)} = t - t_0
t = t_0 - \frac{f(t_0)}{f'(t_0)}

And we've arrived at the formula for Newton's method.

Let's substitute in our f(t) and f'(t). First, we need to find the derivative:

f'(t) = \frac{d}{dt} \Big[ \frac{1}{t^2} - x \Big] = \frac{d}{dt} t^{-2} = -2t^{-3}

Now, proceed:

t = t_0 - \frac{f(t_0)}{f'(t_0)}
t = t_0 - \frac{\frac{1}{t_0^2} - x}{-2t_0^{-3}}
t = t_0 - \frac{(\frac{1}{t_0^2} - x) t_0^3}{-2}
t = t_0 + \frac{t_0 - t_0^3 x}{2}
t = \frac{2t_0 + t_0 - t_0^3 x}{2}
t = \frac{3t_0 - t_0^3 x}{2}
t = \frac{t_0 (3 - t_0^2 x)}{2}
t = t_0 \cdot \frac{3 - t_0^2 x}{2}
t = t_0 ( \frac{3}{2} - \frac{t_0^2 x}{2} )
t = t_0 ( \frac{3}{2} - \frac{x}{2} t_0^2 )

Let's look at that line of code again:

float Q_rsqrt( float number )
{
...
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
...
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
...
}

Earlier in the code, there is a line that performs x_2 \gets x \cdot 0.5. This is just so that \frac{x}{2} doesn't have to be recalculated on each iteration (even though the second iteration was since removed). Note that we multiply by 0.5 instead of dividing by 2 because multiplication is faster than division. Also, we can't do the bit-shifting trick here since x is a floating-point number, not an integer.

The Newton's iteration lines do:

y \gets y (\frac{3}{2} - ( x_2 \cdot y \cdot y ))

which is equivalent to

y \gets y (\frac{3}{2} - \frac{x}{2} y^2 )

which is exactly the expression for Newton's method we found earlier.

This line of code can be repeated to get better and better approximations. However, it appears the authors of Quake III decided that only one iteration was necessary, since the second one was removed.
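
To see the whole thing in action, here's a rough Python translation of the routine, using struct for the bit reinterpretation. It's an illustration of the technique, not the original C:

import math
import struct

def q_rsqrt(number):
    x2 = number * 0.5
    i = struct.unpack(">I", struct.pack(">f", number))[0]  # reinterpret float bits as an integer
    i = 0x5F3759DF - (i >> 1)                               # magic number minus half of I(x)
    y = struct.unpack(">f", struct.pack(">I", i))[0]        # reinterpret back into a float
    return y * (1.5 - x2 * y * y)                           # one iteration of Newton's method

print(q_rsqrt(4.0), 1 / math.sqrt(4.0))  # ~0.4992 vs 0.5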

The End

float Q_rsqrt( float number )
{
...
return y;
}

I'll end with a quote from a relevant xkcd:

Some engineer out there has solved P=NP and it's locked up in an electric eggbeater calibration routine. For every 0x5f375a86 we learn about, there are thousands we never see.

(It looks like the constant Randall mentioned was based on \sigma = 0.0430357, not \sigma = 0.0450466.)

]]>
https://breq.dev/2021/03/17/5F3759DF /2021/03/17/5F3759DF Wed, 17 Mar 2021 00:00:00 GMT
<![CDATA[How I Use Dokku]]>

As I've started to explore cloud and microservices-based projects, I've turned to Dokku to host and manage my running projects. It's become more or less my one-stop-shop for hosting all the projects I've worked on. A lot of my projects have different requirements, though, so I wanted to share the techniques and setup I use to keep everything running smoothly. This isn't an exhaustive list of all the projects I host, but it chronicles the difficulties I've had over time.

This isn't going to be a How-To guide for Dokku--there are plenty of those already. It's more of a collection of tips and tricks, and an explanation of which features I find most useful.

What is Dokku?

With the growth in popularity of microservices, there has also been a growth in "Platform-as-a-Service" providers like Heroku. These platforms abstract away the details of running a Linux VM and installing dependencies, which drastically simplifies the process of deploying apps. However, these services can be expensive--for instance, once you burn through Heroku's free tier, you'll be charged $7 per month per container. This might be a fair price for businesses who need solid reliability and high performance, but it's prohibitively expensive for a tinkerer who just wants to try out a new technology or work on a side project every so often.

Dokku is a self-hosted platform-as-a-service, and it can be installed on any Linux machine you have access to. It's limited to only one machine, so all of your apps will need to fight over the system's CPU, RAM, and so on, and there's no easy way to scale across multiple servers. However, renting a single server is much cheaper than running many different containers on a retail PaaS. At time of writing, I'm running about 15 different projects, which would cost about $7 × 15 = $105 every month. That's a lot more expensive than a $5 VPS.

The Server

My Dokku instance currently runs on a server rented from Azure with credits I've been getting for free (it's a long story). I run Ubuntu LTS because it's stable, popular, and what I'm familiar with. I just used Dokku's bootstrap.sh script to get started. Their docs have a good installation guide.

I use Dokku's letsencrypt plugin to manage HTTPS for all the apps. Handling this at the platform layer instead of the application layer makes applications easier to develop--I don't have to worry about encryption for every single app, I can just configure it once in Dokku. Encryption is a necessity for me, as my website runs on a .dev domain and is thus on the HSTS preload list by default. (I'm glad that Google did this, and I'm happy to be part of the push for HTTPS adoption everywhere, but boy is working around it frustrating sometimes.)

Snowflake

I've written about Snowflake here before, but here's the TL;DR: It's a service for generating unique, time-ordered, 64-bit ID numbers. It's built with Python and Flask.

Notably, it's segmented into two distinct parts: "Snowflake" generates the IDs based on its instance number, and "Snowcloud" assigns the instance numbers to the Snowflake instances. This is for scalability reasons: it's possible to scale the Snowflake app to many different instances, but as long as each has a unique instance number, the IDs will not conflict. Additionally, Redis is used to keep track of the in-use instance numbers.

Deploying Flask apps to Dokku is easy--no special buildpacks are required, and the Procfile is simply web: gunicorn app:app. Adding Redis is also pretty straightforward using Dokku's official Redis plugin. When you create a Redis instance and link it to your app, Dokku will set the REDIS_URL environment variable in your app to point to that Redis instance. For testing locally, I just put export REDIS_URL=redis://localhost:6379 in my .env so that my test and production environments are similar.
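
As an illustration, a minimal Flask app wired up this way might look something like the following. This is hypothetical code, not the actual Snowflake source:

import os
from flask import Flask
import redis

app = Flask(__name__)
r = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379"))

@app.route("/")
def index():
    # Bump a counter in Redis just to prove the linked instance is reachable.
    return {"hits": r.incr("hits")}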

The Snowcloud app will only assign instance IDs to processes with the proper API key. Thus, this key needs to be set in both the Snowcloud and Snowflake app. The dokku config command is your friend here: I just set them as environment variables using dokku config:set SNOWCLOUD-KEY={key}. Being able to set secrets like this in environment variables is really handy--it's great to keep them out of the repo (especially because I like to share my code on my GitHub), and it's a lot easier than trying to store them in a file or something.

AutoRedditor

AutoRedditor is a project I made to quickly return random Reddit posts on demand. It has two main parts: a worker thread to retrieve Reddit posts and store them in Redis, and a web thread to retrieve posts from Redis when a request is received. It's built with Python, but using asyncio, mostly because I wanted an excuse to learn the technology.

Running the worker thread was as simple as adding worker: python3 worker.py to the Procfile. However, running the web thread was a bit more difficult, as I'm using Quart instead of Flask here. Typically, deploying a Quart app is just as simple as replacing Gunicorn with Hypercorn, but it's a bit trickier here. Dokku apps need to respect the PORT environment variable and make their web interface available on that port. The solution is to explicitly specify the port variable in the command: web: hypercorn -b 0.0.0.0:${PORT} app:app.

Cards

Cards is a service to generate image or HTML-based cards for embedding into websites. It uses Pyppeteer, a Python port of the popular Puppeteer JS library used for automating actions in a headless web browser. Specifically, Pyppeteer is used to render and screenshot HTML-based cards in order to produce images. (I experimented with using Selenium, but I found Pyppeteer was easier to install and use.) Because Pyppeteer is based on asyncio, I decided to go with Quart (+ Hypercorn) for this project as well. Getting Pyppeteer to run in an app container isn't particularly straightforward, as it requires a Chromium install to use for the browser.

Thankfully, Heroku has published a Google Chrome buildpack that can be used to install Chrome into the app. Running dokku buildpacks:add cards https://github.com/heroku/heroku-buildpack-google-chrome and dokku buildpacks:add cards https://github.com/heroku/heroku-buildpack-python will configure the app to install both Chrome and a Python runtime. Heroku's buildpack sets the GOOGLE_CHROME_SHIM environment variable, and this just has to be passed to Pyppeteer's launch(executablePath) function. For local testing, leave this variable unset, and Pyppeteer will just use your local Chrome install.
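
The launch logic ends up looking roughly like this sketch, with a hypothetical screenshot helper:

import asyncio
import os
from pyppeteer import launch

async def screenshot(url, path):
    kwargs = {}
    chrome = os.environ.get("GOOGLE_CHROME_SHIM")  # set by the buildpack on Dokku, unset locally
    if chrome:
        kwargs["executablePath"] = chrome
    browser = await launch(**kwargs)
    page = await browser.newPage()
    await page.goto(url)
    await page.screenshot({"path": path})
    await browser.close()

asyncio.run(screenshot("https://example.com", "card.png"))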

Cards will also cache the screenshots so that it doesn't have to run Pyppeteer for every request. To do this, I needed to mount some sort of persistent storage to the running container. The dokku docker-options command was perfect for this: I just needed to add -v /home/breq/cards:/storage to the deploy options.

Emoji

I made a simple emoji keyboard at emoji.breq.dev because I was frustrated that I couldn't send emoji from my computer with Google Voice. I used Jekyll, which is overkill for a single-page site, but I wanted to use my existing website theme and avoid repeating the same code over and over for every single emoji. I wanted to keep my built _site folder out of the repo to avoid cluttering things up, which complicated Dokku deployment a bit.

I needed to find a way to make Dokku build the site when I deployed. I found some buildpacks that worked: Heroku's nginx pack and inket's Jekyll pack. These were surprisingly painless--the Jekyll pack tells Dokku how to install Ruby, run Jekyll to build the site, and point nginx to the _site folder.

Breqbot

Breqbot is a Discord bot I built. It uses the traditional Gateway API instead of the newer Interactions one, so most functions are handled by a worker thread that connects to Discord over a WebSocket. However, some information is available over a REST API, so there's a web thread that runs as well.

The most difficult part about getting Breqbot to work was the voice features. Breqbot includes a soundboard feature to play sound in a Discord voice channel. Handling these audio codecs requires installing quite a few packages: I ended up using Heroku's Apt buildpack to install libffi-dev, libnacl-dev, and libopus-dev, and I used jonathanong's ffmpeg buildpack for, well, ffmpeg.

Minecraft

I wanted to host a Minecraft server on my cloud VPS as well, so I decided to try to use Dokku to manage it. So far, it's been working out pretty well, and it's nice to have everything managed in one place. However, getting Minecraft to work initially in a containerized setup wasn't straightforward.

I found itzg's Docker Minecraft server image and set about making it work with Dokku. Dokku does support deploying from a Docker image, and although the process isn't particularly straightforward, it is at least well documented.

Next, I set the docker options:

  • -e TYPE=PAPER, to run a high-performance Paper server instead of the default vanilla one
  • -p 25565:25565, to expose the Minecraft server port to the Dokku host's public IP address
  • -v /home/breq/minecraft:/data, to configure a persistent place to store the world data, plugins, datapacks, etc

Finally, I needed a way to access the server console to run commands while the server is running. I found mesacarlos' WebConsole plugin, which can provide a password-protected console over the Internet. To expose this console, I used dokku proxy to proxy ports 80 and 443 on the host to port 8080 inside the container. I'm currently hosting the web interface in a separate Dokku app. I just made sure to set the WebConsole port to 443 in the interface to connect to the container using HTTPS.

Syncthing

I run a Syncthing instance to sync files between my desktop and laptop, so I decided to try to run this through Dokku as well. Syncthing provides an official Docker image, so I didn't have to use a third-party one, and they have good documentation as well. The only change I made from the guide was to proxy the web GUI through Dokku with dokku proxy:ports-add syncthing https:443:8384 instead of exposing it directly to the Internet.

Wireguard

I also decided to run a VPN, and I chose Wireguard because it seemed simple, well-supported, and lightweight. The LinuxServer.io team maintains a Wireguard image, so I just needed to deploy it to a Dokku app. Using the documentation as a reference, I set these docker options:

  • --cap-add=NET_ADMIN --cap-add=SYS_MODULE as Wireguard runs as a Linux kernel module, which can't be containerized, so access to the kernel modules and network has to be granted
  • -e TZ=America/New_York to set the timezone
  • -e SERVERURL=vpn.breq.dev to set the server URL
  • -p 51820:51820/udp to expose the VPN port to the Internet
  • -v /home/breq/vpn:/config to mount the configuration files as a volume--this lets you grab the config files to distribute to the peer computers
  • -v /lib/modules:/lib/modules to mount the kernel modules directory, as Wireguard runs as a Linux kernel module
  • --sysctl=net.ipv4.conf.all.src_valid_mark=1 to allow routing all traffic through the VPN
  • -e PEERS=breq-desk,breq-laptop,breq-phone to define the peers I want to connect

Conclusion

Running everything in one place has simplified things a lot. Being able to deploy apps quickly is great for prototyping ideas. Overall, I'm really glad I decided to start using Dokku to manage all the services I'm hosting in the cloud.

]]>
https://breq.dev/2021/02/10/dokku /2021/02/10/dokku Wed, 10 Feb 2021 00:00:00 GMT
<![CDATA[Snowflake]]>

Overview

This is a service to generate Snowflake IDs. These are unique, time-ordered IDs, useful for things like instant messages or posts.

Motivation

I wanted to get more experience working with the 12 Factor App model. Implementing a basic service like this seemed like the easiest way to go about it. I also figured this kind of service could be useful for me at some point.

Technical Description

The "Snowflake" ID format was designed at Twitter for identifying tweets, so its scalability is its main selling point. Each Snowflake is a 64-bit integer composed of a timestamp in milliseconds (42 bits), worker ID (10 bits), and increment (12 bits).
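
Packing those fields into an ID is just a bit of shifting and masking. Here's a minimal sketch; the custom epoch and exact bit layout below are assumptions for illustration, not necessarily what my service uses:

import time

EPOCH_MS = 1577836800000  # assumed custom epoch (2020-01-01 UTC, in milliseconds)

def make_snowflake(worker_id, increment):
    timestamp = int(time.time() * 1000) - EPOCH_MS
    # 42 bits of timestamp | 10 bits of worker ID | 12 bits of increment
    return (timestamp << 22) | ((worker_id & 0x3FF) << 12) | (increment & 0xFFF)

print(make_snowflake(worker_id=3, increment=0))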

The format can support 1024 possible worker IDs -- too many to hardcode by hand, but too few to assign randomly. So, I made a second app, "SnowCloud," that handles the assignment of these worker IDs to the Snowflake server instances. When a Snowflake server comes online, it will request a worker ID from a SnowCloud server. Then, it will periodically renew the worker ID with the SnowCloud server. The pool of worker IDs is stored as a Redis sorted set, ordered by how recently each was used. In addition, each Snowflake server is identified by a UUID, and these UUIDs are tracked to ensure Snowflake servers can only renew their own assignments. (For reference, the SnowCloud repo is at, predictably, breqdev/snowcloud.)

Results

It does everything I set out to accomplish, and I got more experience developing microservices, which was cool. This was just a quick afternoon project--it was cool to go from idea to finished product in just a couple hours.

]]>
https://breq.dev/projects/snowflake /projects/snowflake Fri, 04 Dec 2020 00:00:00 GMT
<![CDATA[STMusic]]>

Note that unlike other projects on this site, I wrote this for a homework assignment. That being said, adding in the speaker and music were things I did because I wanted to have fun, not to meet the assignment requirements.

Here's a video I recorded demonstrating the game.

Overview

This game is similar to Guitar Hero or other rhythm-based games. The player should push the "USER" button whenever one of the scrolling indicators reaches the left of the screen. This was the first major project I made using C. The biggest challenge in this project was figuring out how to store and generate the sounds that made up each song.

Motivation

This was a homework assignment in my Computer Organization class to help us learn how to use bitwise operations to display a game on the ST Discovery's LCD. I chose to add sound functionality mostly because I thought it would be an interesting challenge, I was interested in how basic microcontrollers generate different sounds, and I figured a silent game would be rather boring.

Technical Description

Sound Generation

The STM32 microcontroller this project used doesn't have any purpose-built hardware for generating sounds (that I'm aware of). So, the solution I settled on was to manually generate a square wave by setting a GPIO pin high, waiting for half the length of the waveform, setting it low, and waiting for the rest of the waveform.

The biggest hurdle with this approach was accurate timing. The STM32 can use interrupts to delay for a precise number of milliseconds, but generating square waves at specific frequencies requires sub-millisecond precision. The solution I came up with was to calibrate a busy-wait loop when the code begins using the millisecond timer, then use that busy-wait loop for sub-millisecond-precision delays.

This yielded a decent-sounding square wave, but the game audio still felt incomplete.

I attempted to play multiple notes at once by summing the square waves, but the result did not sound very good. Additionally, the timing code required to play two separate frequencies at once quickly became complicated. Perhaps I could have used two separate GPIO pins and a voltage divider to effectively increase the bit depth (allowing for 4 separate voltage levels to be sent to the speaker).

Instead of attempting that, I decided to try adding drum sounds. By playing each drum sound and then quickly switching to playing the melodic note, the device can give the illusion that both sounds are playing at once. This didn't work out as well as I had hoped, but it sounded okay at least.

For the kick drum, I borrowed a trick used frequently by composers for the NES: By doing a rapid downward pitch bend in between melodic sections, it's possible to fake a kick drum sound somewhat convincingly. Because I don't have the luxury of a triangle-wave channel, this doesn't sound as good in my project as it does in NES games, but the trick still works.

For the snare drum, I decided to just use a burst of white noise. But as the STM32 doesn't have any built-in random number generation, I had to choose a pseudorandom algorithm to implement.

At first, I tried to use a Linear Congruential Generator, because it seemed easier to implement. However, with the parameters I chose, the period was small enough that I could hear a distinct tone in the output. I could have probably eliminated this by choosing better parameters, but I didn't want to spend a bunch of time tuning them.

I then looked into using a Mersenne Twister, because it seemed like a popular choice. I ultimately decided against it, as it seemed hard to implement. I also worried that it might be too slow, considering I'd want to be sending bits to the GPIO pin as fast as possible to ensure the snare sound had enough high-end frequencies.

Finally, I settled on XorShift, which was fast and simple to implement.
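
For reference, a 32-bit xorshift step is only a few operations, shown here in Python. The shift constants 13/17/5 are the textbook Marsaglia values, not necessarily the ones I used on the STM32:

def xorshift32(state):
    # One step of a 32-bit xorshift generator.
    state ^= (state << 13) & 0xFFFFFFFF
    state ^= state >> 17
    state ^= (state << 5) & 0xFFFFFFFF
    return state & 0xFFFFFFFF

state = 0xDEADBEEF
for _ in range(8):
    state = xorshift32(state)
    print(state & 1)  # one pseudorandom bit per step, plenty fast for a noise burst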

Data Packing

After figuring out how to synthesize the song, I needed to figure out how to store it. The trial version of CrossWorks Studio that I was using restricted me to a 16kB code size. I initially wanted to include multiple long songs (although I later scrapped this due to time constraints), so I needed to find an efficient way to store each note, drum, and indicator on the screen.

I decided early on to try to fit the information for each beat into a small integer and store these integers in an array. I looked into what information I would need to store:

  • Note pitch (7 bits when stored as MIDI note number)
  • Drum sound (2 bits - kick, snare, or neither)
  • Indicator (1 bit)

To store each note pitch, I decided to use MIDI note numbers. These only use 7 bits per note, and they can be converted to frequencies using a basic formula, so this was a much better solution than trying to store the note frequency or wavelength.
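
For reference, the standard equal-temperament conversion from MIDI note number n to frequency in hertz (with note 69 defined as A4 = 440 Hz) is:

f = 440 \cdot 2^{(n - 69)/12}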

10 bits is kind of an odd size, so I tried to figure out what else I could include to use all bits in a 16 bit integer. The first thing I added was duty cycle controls. The original NES had 3 duty cycle settings, and composers could create interesting effects by switching between them. I decided to add 4 duty cycle settings to this project, although they didn't sound as different as I had hoped (likely due to the poor quality speaker I used). This brought the total up to 12 bits.

Finally, I came up with the idea of including a "message length" field which would specify how many beats after this one were to be held out. This could drastically compress the resulting array by removing duplicate entries. I made this 4 bits long, allowing for up to 16-beat messages.
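
As a sketch of what that packing looks like in Python (the exact field order here is an assumption, not necessarily the layout I used):

def pack_beat(pitch, drum, indicator, duty, length):
    # Assumed layout: [length:4][duty:2][drum:2][indicator:1][pitch:7] = 16 bits total
    return ((length & 0xF) << 12) | ((duty & 0x3) << 10) | ((drum & 0x3) << 8) \
        | ((1 if indicator else 0) << 7) | (pitch & 0x7F)

print(hex(pack_beat(pitch=60, drum=1, indicator=True, duty=2, length=3)))  # 0x39bc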

Here's the spreadsheet I built to pack these messages together for me. On the left, you can set the parameters of each note: is a marker shown on the display? what specific note is played? for how long? is there a drum sound? etc. On the right is each section packed into a single 16-bit integer. An array of these integers can be included with the game code to play back the song. By switching the tab at the bottom, you can see both of the songs I included with the game.

Results

I had fun working on this, and I learned a lot about programming for embedded systems. The results didn't sound spectacular, but they still enhanced the final game in my opinion.

]]>
https://breq.dev/projects/stmusic /projects/stmusic Fri, 09 Oct 2020 00:00:00 GMT
<![CDATA[BlockChat]]>

Description

BlockChat is a simple decentralized chatroom application that uses a blockchain to store the message data.

Motivation

This project was something I quickly threw together in order to better understand blockchain while the Bitcoin boom was taking place. I couldn't find a resource that explained proof-of-work, consensus, and other blockchain-related topics well enough for me to understand them, so I decided to try making my own blockchain application, loosely based around a couple of examples I saw around the Internet.

Technical Description

At its core, a blockchain is a distributed, public ledger, so I decided that the easiest blockchain application to write would be a chatroom. At the time, it seemed much easier than a digital token or currency system.

I started by implementing the blockchain data structure. As the name implies, this is a chain of blocks, where each new block contains the cryptographic hash of the previous block, so that old data cannot be modified without also re-writing all of the new data.
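
The core of that structure fits in a few lines--each block embeds the hash of its predecessor. Here's a simplified sketch, not my exact implementation:

import hashlib
import json

def block_hash(block):
    # Hash a canonical JSON representation of the block.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

genesis = {"index": 0, "previous_hash": "0", "messages": []}
block1 = {"index": 1, "previous_hash": block_hash(genesis), "messages": ["hello"]}
# Changing anything in `genesis` changes block_hash(genesis), which breaks block1's link.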

Then, I moved on to the proof-of-work algorithm. This algorithm is designed to be computationally expensive so that new blocks take a significant amount of time to "mine". Because of this, if somebody wanted to re-write a previous part of the blockchain, it would take a prohibitively large amount of computing power to then re-write all of the blocks that follow.

For my proof-of-work algorithm, I decided to require finding an integer with a hash that starts with a certain number. Solutions to this are difficult to find (as miners have to blindly guess at what the number is) but easy to verify (just hash the potential number and see if the first digits match). To make sure no miner tries to re-use the same solution multiple times, I required each new solution to be a larger number than the previous one.
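
In sketch form, the mining loop is just a brute-force search; the difficulty prefix here is arbitrary and not the exact rule I used:

import hashlib

def valid_proof(proof, prefix="0000"):
    # A proof is valid if its hash starts with the required prefix.
    return hashlib.sha256(str(proof).encode()).hexdigest().startswith(prefix)

def mine(last_proof):
    # New solutions must be strictly larger than the previous one.
    proof = last_proof + 1
    while not valid_proof(proof):
        proof += 1
    return proof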

I didn't put too much effort into optimizing my miner implementation - it only uses a single CPU thread, sequentially searching numbers. In an actual blockchain scenario, this isn't a good choice, as anyone who could write a better miner (by using multiple CPU cores, GPUs, or even FPGAs) would then have much more computational power compared to the rest of the network. I also didn't make the difficulty of the proof-of-work change in response to miner availability. This also isn't a good idea for a real blockchain, as periods of low miner activity may not be able to mine new blocks quickly. Worse, if many more miners decided to mine my blockchain, a too-easy proof-of-work would make it easier for malicious miners to modify the blockchain.

Next, I worked on the consensus algorithm. For a node to accept new blocks, it first needs to verify the proof-of-work values and hashes to ensure the chain it receives is valid. Assuming it is, the node needs to choose which version of the chain it should prefer. In my implementation, the node will simply choose the longest version.

Finally, I needed to devise the protocol which would allow nodes to add messages to the blockchain and compare versions with other nodes. I decided to use a REST API, because it was the easiest solution.

Results

In the end, the program did work. However, there were a few corners I cut that made the end result kind of impractical.

The first issue was that I did not include a distinction between nodes and miners. In actual blockchains, nodes that want to put data on the blockchain will send that data to many different miners, so that their data is likely to be included in the next block, regardless of which miner mines it. In my demo, in order for anybody to send a message to the chatroom, they have to be the one to mine the next block. If the blockchain is scaled beyond just a few miners, the chances of this happening would be nearly zero, so actually using the chatroom would be almost impossible.

The second issue is the lack of reward for miners. Initially, this was one of the things I didn't intend to include, because I wanted to keep things simple. But because the miners have no incentive to mine blocks, it would be difficult to get benevolent miners to participate in the blockchain, and it would be easy for "evil" miners to gain enough of a share of the computing power to undermine the stability provided by proof-of-work.

While the end result wasn't something practical to deploy, working on this project definitely helped me understand the underlying design of blockchain platforms. It also answered many questions I had about the technology, such as "Why do non-currency applications like Namecoin still have tradeable tokens?" and "How exactly does proof-of-work ensure the stability of the blockchain system?", so I would definitely consider it a success.

]]>
https://breq.dev/projects/blockchat /projects/blockchat Fri, 09 Oct 2020 00:00:00 GMT
<![CDATA[Pinewood Derby Car]]>

A quick video of the car in action.

Motivation

When I was a smol child in cub scouts, we had a competition to see who could build the fastest car. I didn't really care about that, so I decided to try and make something cool instead using an Arduino I had recently gotten.

Technical Description

Almost everything is wired directly up to the Arduino, except the LED matrix, which runs off of the I2C bus. I modified some example code to display an animation of flames scrolling across the display (which I had drawn up on some graph paper).

Results

I think I won an award for it? (To be honest it's been so long that I can't remember well). But I do remember the battery dying - I had used a small garage-door-opener battery in order to fit within the weight limit, and that ended up dying pretty quickly.

]]>
https://breq.dev/projects/pinewood-derby-car /projects/pinewood-derby-car Fri, 09 Oct 2020 00:00:00 GMT
<![CDATA[Breqbot]]>

One of the many "just for fun" commands that Breqbot has.

A previous version of this post had a link to invite the bot to your server. I have removed this link for a few reasons:

  1. The bot was built for a small set of friends half a decade ago, when my friend group and I were both very different from how we are now
  2. I have no interest or motivation in keeping this bot running
  3. While I remain proud of the work I put into this bot, it provides no particular utility compared to other bots out there

The GitHub repo will remain accessible.

Overview

Breqbot is a Discord bot that manages a virtual economy, provides several fun minigames, gives access to comics, and has a variety of other features. Users can add Breqbot to a Discord server with their friends, giving their community access to these features.

Motivation

A couple things came together to make this project happen. First, one of my friends invited me to a Discord server where they were using a variety of other popular Discord bots. I wasn't a fan of how some of these bots ran their economy or other features, and I wanted to see if I could implement something better. Second, my work on McStatus's backend using Heroku gave me experience working with microservice architectures, and I wanted to work on a more complex microservices project.

Technical Description

A running Breqbot instance consists of three containerized processes: the web process, the Discord worker process, and the Redis instance.

The Discord Worker

The Discord worker is written with discord.py, using its bot commands framework. This process listens for Discord events over the Discord Gateway, selects an appropriate command, and executes it. Here are some examples of these commands:

By adding reactions to the message (the arrow emojis), members of the Discord server can work together to play the famous browser game "2048." Breqbot will listen for these reactions, modify the internal game state, and update the game board displayed.

Breqbot allows a user to configure their profile, which can then be displayed on request by other users. These profiles are images drawn using Pillow, a Python library for image manipulation.

Breqbot can pull Vex Robotics Competition data from VexDB and create a summary of a team's performance. Here's my team from 2019-2020.

Breqbot can share comics from a few series including the famous xkcd. Additionally, a worker task will continuously monitor these comics for updates, and it can be configured to automatically post them to certain channels.

The Soundboard app lets server members add sounds (in the form of YouTube links) and corresponding emoji. Then, if someone reacts to the soundboard using that emoji, the corresponding sound will play in the server's voice channel.

Discord has a powerful "Roles" feature to help identify members of a server, but it does not have a way for users to assign themselves roles. Breqbot provides a "reaction roles" menu system - by reacting to this message, a user can select which roles they would like to receive, and Breqbot will automatically modify their roles as necessary.

The Redis Instance

Most of Breqbot's features rely on Redis to store user data. For example, the role menu will store the set of messages it needs to watch reactions for, and the profile feature will store a user's configuration in Redis. This allows the Discord worker to be restarted at any time with minimal disruption.

One notable use of Redis is to cache Reddit posts. Breqbot uses PRAW to automatically retrieve popular posts from Reddit to display. However, the Reddit API is slow, and it does not provide a method to choose a random popular post. Because of this, Breqbot uses a background task to periodically retrieve the 100 most popular posts from a variety of subreddits and store them in Redis. Then, when a user requests a post from one of these subreddits, Breqbot can retrieve it from its cache.
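
The caching pattern itself is simple--something along these lines (a sketch, not the actual Breqbot code):

import json
import random
import redis

r = redis.Redis()

def refresh_cache(subreddit, posts):
    # Called periodically by the background task with the ~100 hottest posts.
    r.set(f"posts:{subreddit}", json.dumps(posts[:100]))

def random_post(subreddit):
    # Called when a user requests a post; never touches the Reddit API.
    return random.choice(json.loads(r.get(f"posts:{subreddit}")))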

The Web Process

The web process runs Breqbot's accompanying website. If a Discord server chooses to enable it, Breqbot can publish information about that server's economy and its members to a website URL. The web process will use Redis to get this information - it does not communicate directly with the Discord worker.

The Portal API allows other programs to connect to a Breqbot instance to provide additional functionality. In this example, the Portal (which was hosted on my laptop) will echo back any input it receives. A more practical use of this would be to build functionality that interacts with the real world, such as a remote-control robot that communicates over Discord.

Portal clients connect to Breqbot using WebSockets, which are handled by the web process. Requests and responses are sent between the Discord and Web processes through a Redis pub/sub channel. Management of Portal clients, including distribution of API keys, is handled by the Discord worker: if a user wants to register a new portal, they will receive their API keys through a direct message on Discord.

Results

This was my first project that ended up being used by such a wide audience: many of my friends added it to their communities on Discord, and people were constantly requesting features or finding bugs. Still, I really enjoyed developing it and using it with my friends, and it was rewarding to see other people enjoying my work.

]]>
https://breq.dev/projects/breqbot /projects/breqbot Thu, 08 Oct 2020 00:00:00 GMT
<![CDATA[LPS System]]>

A quick demonstration of the system in action.

Overview

The LPS (Local Positioning System) is a system to help robots determine their location in a predefined area. This system uses a camera and colored markers to determine the position of robots and other objects in a scene, then relays that information to robots over a network connection.

Motivation

Localization in robotics is a difficult but important problem. Through the Vex Robotics Competition, I’ve had experience trying to get a robot to perform pre-programmed movements. I’ve also worked on several automated robots outside of VRC. In all cases, I’ve been dissatisfied with the poor options available for determining a robot’s position.

Current Systems

Accelerometer

Because an accelerometer measures acceleration, it must be double-integrated to track the robot’s position. This means that any error will drastically affect the result over time, and the calculated position will rapidly lose precision.

Encoder Wheels

By placing encoders on wheels, a robot can track its movement over time. VRC team 5225A, “πlons,” used a technique with three separate encoders to track the robot’s movement, its slip sideways, and its orientation (described here). However, any error (wheel slop, robot getting bumped or tipping, etc) will accumulate over time with this method. πlons needed to supplement their tracking system by occasionally driving into the corners of the VRC field in order to reset the accumulated error.

Global Positioning System

GPS can work well for tracking a robot’s general position in a large area. However, common receivers have an accuracy of only about 3 meters, and while modules with centimeter-level accuracy are available, they are prohibitively expensive.

The "Vex GPS": An example of the "Pattern Wall" approach to localization.

Pattern Wall

By placing a specific code along the walls of the area, a camera mounted on a robot can determine its position and orientation. This method will be used in the Vex AI Competition. However, other objects in the field can block the robot’s view of the pattern wall. Additionally, this system requires a defined playing field with straight sides and walls, and each robot is required to have its own camera and computer vision processing onboard.

LiDAR

A LiDAR sensor can create a point cloud of a robot’s surroundings, and SLAM (Simultaneous Localization And Mapping) algorithms can automatically map an area and determine the robot’s position relative to other objects. However, LiDAR sensors are large and expensive, they must be mounted with a clear view of the robot’s surroundings, and the robot must have enough processing power to run these algorithms. Additionally, this system will not work as well in an open environment, where there won’t be any objects for the LiDAR to detect, and it might have difficulty determining the difference between stationary objects (such as walls) and moving objects (such as other robots in the area).

Radio trilateration

If a robot is able to determine the relative signal strength of multiple different radios (such as WiFi access points or Bluetooth beacons), and it knows the positions of these radios, it can estimate its position relative to them by assuming signal strength correlates with distance. Because there are many other variables that affect signal strength, such as objects between the robot and radio, this method is not very reliable. Also, robots that use wireless communication will need to make sure their transmissions do not affect their ability to determine their location. This problem is worse if multiple robots are present in an area.

New System

In the past, I have more or less given up on finding a system that doesn’t accumulate error, instead programming the robot to navigate open-loop and drive into a wall (of known position) every so often to reset its position. As this approach doesn’t work well outside of a clearly-defined environment, I wanted to see if I could come up with a better solution for localization.

I set the following criteria for such a system:

  • Work well in scenarios where multiple robots are present
  • Not require complex processing on each robot
  • Be simple to set up in a given area (not requiring calibration)
  • Be relatively inexpensive and use commonly available parts
  • Not accumulate error over time

I decided to primarily target a scenario with multiple tracked objects in a single area. Putting the heavy processing outside of the robot allows for more inexpensive robots powered by basic microcontrollers. It also lets passive objects in the scene be tracked, so robots can interact with them.

I decided to try implementing a solution that worked outside the plane that the robots would be in, so that it would not be affected by adding more robots or objects into the area. Cameras are common and relatively cheap, so using a camera mounted on a tripod and pointed down at robots seemed like the best approach.

Technical Description

1. Image Scanning

The system starts with a picture of the scene (left image). I’ve placed a computer vision marker in the scene along with a bunch of random objects.

Markers have a blue border around them to distinguish them from the environment. I used painters’ tape for this, as it has a consistent hue in a variety of lighting conditions. The system selects all the pixels in the image with this hue (and ample lightness and saturation).

This mask is shown in the right image. The marker is plainly visible here, and almost all of the other objects in the scene have been filtered out. The pencil and textbook happen to have the same hue, but they will be filtered out later.
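
The masking step looks roughly like this in OpenCV (the thresholds are placeholders -- the real ranges were tuned to the tape, camera, and lighting, and the hue/lightness/saturation filter described above is approximated here with HSV):

import cv2
import numpy as np

frame = cv2.imread("scene.jpg")          # placeholder filename
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Keep pixels whose hue is roughly "painters' tape blue" and which are
# saturated and bright enough to rule out shadows and grey surfaces.
lower = np.array([95, 80, 60])           # H, S, V lower bounds (guesses)
upper = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower, upper)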

The first step in handling the masked areas is to determine their boundary. OpenCV has a convenient function for tracing the border of each region of interest.

Now, we have a series of points tracing the boundary of the marker. This isn’t a perfect quadrilateral for a variety of reasons (the camera introduces noise to the image, the lens introduces distortion, the marker isn’t completely square, the image resolution is low so the image is a bit blurry, etc). To approximate this boundary to four points, with one at each corner, we can use the Douglas-Peucker algorithm (conveniently also implemented by OpenCV).

We can also do some filtering here to discard areas that aren’t markers. For instance, areas that are very elongated, aren’t quadrilaterals, or are very small probably aren’t valid markers. This leaves us with the marker in the image, surrounded by a red bounding box.
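
Continuing the sketch from the masking step, the boundary tracing, Douglas-Peucker simplification, and filtering might look like this (the thresholds are guesses, not the values I actually used):

import cv2

# "mask" comes from the hue-filtering sketch above (OpenCV 4.x return values)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

candidates = []
for contour in contours:
    perimeter = cv2.arcLength(contour, True)
    approx = cv2.approxPolyDP(contour, 0.02 * perimeter, True)   # Douglas-Peucker
    if len(approx) != 4:                     # must simplify to a quadrilateral
        continue
    if cv2.contourArea(approx) < 400:        # too small to be a marker
        continue
    x, y, w, h = cv2.boundingRect(approx)
    if max(w, h) > 4 * min(w, h):            # very elongated regions aren't markers
        continue
    candidates.append(approx.reshape(4, 2))  # four (x, y) corner points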

2. Transformation Calculation

The next step is to figure out what type of marker this is, and what its orientation is. Because there might be multiple markers in a scene, the black and white squares on the inside of the blue border give information on what type of marker it is.

Using the bounding quadrilateral of the marker, we can calculate where we expect the squares to be. To do this, we need to calculate the transformation between the basis of the picture (i.e., with the origin at the top-left pixel, the x-axis along the top row of pixels, and the y-axis along the left column of pixels) and the basis of the scene. For reading the squares, it’s convenient to set the origin of the scene at the center of the marker, choose x and y axes arbitrarily, and set the length of each square to 1 unit.

It’s apparent that the sides of the marker aren’t parallel lines in the image, even though they are parallel in real life. Because the camera is a perspective camera, we need to use a perspective transformation (in which parallelism isn’t preserved). In order to calculate this type of transformation, we need to know the coordinates of four points in each basis. Conveniently, our quadrilateral bounding the marker has four points which we know the image coordinates of, and we know the real-life position of the marker’s corners relative to its center as well.

In order to compute a perspective transformation, we need a new coordinate system. Consider a point that appears on the horizon: it would be mapped to a point infinitely far away. Representing this isn’t possible with only X and Y coordinates, so we need a third coordinate.

The solution is a system known as homogeneous coordinates. To map our existing Cartesian coordinates to homogeneous ones, we just include a 1 as the third coordinate. To map them back to Cartesian coordinates, divide the X and Y coordinates by the third coordinate. This solves our horizon-point problem: points with a third coordinate of 0 have no defined Cartesian equivalent.

(While the system is operating normally, points on the horizon line shouldn’t ever actually be visible. Despite this, we still need to consider it in order to define the perspective transformation.)

Using our new homogeneous coordinates, we can calculate a transformation matrix which transforms coordinates in the picture to coordinates in real life (and invert that matrix to go the other way).
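
OpenCV will solve for this 3x3 matrix given the four corner correspondences. A sketch (the pixel coordinates and the assumption that the marker is 4 units across are made up for illustration):

import cv2
import numpy as np

# Corners of the detected marker in the picture (pixels), in a consistent order.
image_corners = np.float32([[212, 88], [341, 95], [352, 214], [203, 209]])

# The same corners in the marker's own coordinate system: origin at the center,
# one unit per square (assuming a marker 4 units across for this sketch).
scene_corners = np.float32([[-2, -2], [2, -2], [2, 2], [-2, 2]])

picture_to_scene = cv2.getPerspectiveTransform(image_corners, scene_corners)
scene_to_picture = np.linalg.inv(picture_to_scene)

def apply_homography(matrix, point):
    # Map a 2D point through the 3x3 matrix using homogeneous coordinates.
    x, y, w = matrix @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)          # divide by the third coordinate to get back to 2D

print(apply_homography(picture_to_scene, (212, 88)))   # prints roughly (-2.0, -2.0)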

3. Marker Identification

Because we know the real-life coordinates of the corners of each square, we can use the transformation matrix to map them to coordinates of pixels in the picture. The yellow grid in the image shows where the squares are expected to be.

At least one square is guaranteed to be black and at least one white. To compensate for varying lighting conditions, the system identifies the darkest and lightest squares, then compares the lightness of the other squares against them to determine whether each is black or white. On the left of the image, the grey square shows the average lightness of the darkest square on the marker, and the white square shows the average lightness of the lightest square on the marker.

No pattern of squares is rotationally symmetrical, so the orientation of the pattern can be used to determine the orientation of the marker. The system can now draw the position and orientation of the marker over the camera feed.
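
A sketch of how this code-reading step might work (the 2x2 square layout, the sampling window, and the specific codes are assumptions for illustration; to_pixel maps a point from the marker's coordinate system into pixel coordinates, e.g. by applying the scene_to_picture matrix from the previous sketch):

KNOWN_CODES = {
    (0, 0, 0, 1): "reference",   # e.g. 3 black, 1 white
    (1, 1, 1, 0): "object",      # e.g. 1 black, 3 white
}

def read_code(gray, to_pixel):
    # gray: grayscale frame as a NumPy array; to_pixel: maps scene coords to pixels.
    centers = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
    lightness = []
    for cx, cy in centers:
        px, py = to_pixel((cx, cy))
        patch = gray[int(py) - 2:int(py) + 3, int(px) - 2:int(px) + 3]
        lightness.append(patch.mean())

    # Compare each square to the midpoint of the darkest and lightest squares,
    # so the threshold adapts to the current lighting.
    threshold = (min(lightness) + max(lightness)) / 2
    bits = tuple(int(l > threshold) for l in lightness)

    # None of the codes are rotationally symmetric, so trying all four rotations
    # both identifies the marker type and recovers its orientation.
    for rotation in range(4):
        rotated = bits[rotation:] + bits[:rotation]
        if rotated in KNOWN_CODES:
            return KNOWN_CODES[rotated], rotation
    return None, None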

4. Global Reference

I’ve added a few new markers into the scene. These markers have a different code on them: one black square and three white squares. I’ve also placed down a ruler for scale.

The system will distinguish between the original markers and the new markers. It also plots each of the new markers on a grid, relative to the original marker. You can see from the ruler that the rightmost marker is about 12” away horizontally and 0” away vertically from the original marker, and the system plots its position as (12, 0).

The different square codes signify the different types of marker present. The original marker, with its 3-black, 1-white code, is the global reference - its position and size are used to plot the positions of the other markers. This means that the size of the global reference matters, but the size of the other markers does not (and you can see each marker in the image is a different size).

Mathematically, the hard work is already done: we can use the transformation matrices for the reference marker that have already been calculated. In a previous step, the system used the transformation matrix on the corners of each square, mapping each point’s coordinates in the scene to its coordinates in the picture. Similarly, we can use the reference’s inverse transformation matrix to map each additional marker’s coordinates in the picture to its coordinates in the scene.
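
In code, this is one call to OpenCV's perspectiveTransform (variable names follow the earlier sketches; the units of the result are whatever units the reference marker's corners were given in, e.g. inches):

import cv2
import numpy as np

def locate_markers(marker_centers_px, reference_picture_to_scene):
    # marker_centers_px: list of (x, y) pixel coordinates of detected marker centers.
    pts = np.float32(marker_centers_px).reshape(-1, 1, 2)
    scene = cv2.perspectiveTransform(pts, reference_picture_to_scene)
    return [tuple(p) for p in scene.reshape(-1, 2)]

# e.g. a marker about 12" to the right of the reference comes out near (12, 0)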

The system plots an overhead-view grid of each visible marker. Note that the orientation and position of each marker is visible. Additionally, the size of each marker is known: smaller markers have a shorter arrow.

5. Post-Processing and API

The raw plot can be kind of jittery because of image noise, low camera resolution, etc. The system will average the position of each marker over time to get a more stable and accurate plot. This smooth plot is also available using a REST API for other devices to use.

To do this, however, the system needs to keep track of each marker. When markers are found in a new frame, it has to determine whether each one is a new marker introduced into the scene or an existing marker that has moved. To help with this, each marker is assigned a letter, which also gives API clients a stable way to refer to it across frames.
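
Here's a sketch of the tracking-plus-smoothing idea (the matching threshold, window size, and letter assignment scheme are guesses, not the system's exact behavior):

import string
from collections import deque

class MarkerTracker:
    def __init__(self, window=10, match_distance=2.0):
        self.history = {}                    # letter -> deque of recent (x, y) positions
        self.window = window
        self.match_distance = match_distance

    def update(self, detections):
        for x, y in detections:
            # Find the closest marker we're already tracking.
            best = min(
                self.history,
                key=lambda k: (self.history[k][-1][0] - x) ** 2
                            + (self.history[k][-1][1] - y) ** 2,
                default=None,
            )
            if best is not None:
                bx, by = self.history[best][-1]
                if (bx - x) ** 2 + (by - y) ** 2 <= self.match_distance ** 2:
                    self.history[best].append((x, y))   # existing marker that moved
                    continue
            # Otherwise it's a new marker: assign the next letter (sketch caps at 26).
            letter = string.ascii_uppercase[len(self.history)]
            self.history[letter] = deque([(x, y)], maxlen=self.window)

    def smoothed(self):
        # Average each marker's recent positions for the "smooth" plot and the API.
        return {
            letter: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
            for letter, pts in self.history.items()
        }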

Additional Feature: Labels

There is another type of marker with 2 black squares. These are labels, and they contain additional squares below the marker to identify it. These squares are read the same way as the squares inside the marker. As shown in the readout, this label has the number “1”. Labels show up with their number on both the raw plot and the smooth plot, and are identified by their number through the API.

I made a sheet of all possible labels. It’s difficult to see each one on the camera view, but the two plots show each label from 0 to 15.

Results

This project proved that a system using an overhead camera and computer vision markers can be a reliable method of localization for small robots. However, there are definitely issues with the approach that I did not anticipate.

Unreliable Marker Corner Detection

In this system, the corners of each marker were used to create the transformation matrices. Because of this, any issues in the detection of these corners will distort the resulting transformation and affect the calculated position of all other objects in the scene. This means that slightly misshapen markers, backgrounds with a similar hue to the markers, and other small issues can cause extremely inaccurate results.

This could be mitigated by using points within the marker to calculate this transformation, but overall, extrapolating one small reference marker to cover a large area will magnify any errors caused by lens distortion, image blur, etc. and make the entire system much less precise.

In order to increase the precision of the system, multiple reference markers should be used. However, this raises questions: How will the system know the distance between each marker? Will users be required to precisely set up multiple markers at exact distances from each other? In any case, this would make the system more difficult to deploy and use.

Unreliable Marker Filtering from Background Objects

Because the system detects markers based on color, any other blue objects in the scene may cause markers to be detected which aren’t actually markers. Additionally, color detection is unreliable because changes in lighting type and strength can influence how colors appear.

An excerpt from the QR code specification describing how the "Finder Pattern" can be recognized.

A common method for detecting other sorts of computer vision markers is to look for some pattern in the lightness of the code. For instance, QR code scanners look for the 1-black, 1-white, 3-black, 1-white, 1-black pattern in the corners of each code. Designing a computer vision marker that can be identified with this kind of technique would significantly reduce false detections in this step.

However, a method like this would require more processing power, making the system slower and less capable of running on low-end hardware like a Raspberry Pi. It might also force the system to run at a lower camera resolution.

Poor Coverage Area

A diagram of the effective coverage area of my camera-based LPS system.

Because of the camera’s limited resolution and field of view, the usable area of the system is oddly shaped. This limits its effectiveness, since most real spaces don’t match that shape. Multiple cameras could be used to increase the effective area, but this complicates the system and increases the cost significantly.

Closing Thoughts

Despite these issues, the system does accomplish the goals I set out at the beginning of the project. However, because of the system’s mediocre accuracy and reliability, I do not anticipate using it for anything in the future in its current state, although I do think this approach to localization is potentially useful (albeit with a lot more refinement).

I am curious as to how much better this sort of system would work if some of the aforementioned solutions were implemented. A system using four separate reference markers instead of one, QR-style finder pattern codes to help objects be more accurately recognized, and perhaps multiple cameras positioned on opposing sides of the area would likely have a much more useful accuracy. Although such a system would be more difficult and expensive to set up and operate, it would still be a practical and economical solution for providing localization capabilities to one or more robots in an enclosed area.

FAQ (Fully Anticipated Questions)

What language was this written in?

I built this using Python, with OpenCV handling some of the heavy lifting for computer vision. I chose Python because I was familiar with it, but it would definitely be possible to implement a similar system in JavaScript for a web version.

Why invent your own computer vision marker instead of using some existing standard?

I tried to use QR codes, which would’ve saved me from doing most of the computer vision work and would’ve allowed me to focus on the linear algebra side of things, but the camera resolution was too low for that to work well. QR scanner libraries are built to recognize one or two codes positioned close to the camera, not many small codes positioned far away.

In hindsight, I’m kind of glad I had this problem, because working with computer vision was honestly pretty fun. I hadn’t really done anything in this field before.

Why is the camera resolution so low?

I ran this at 480p because OpenCV isn’t able to grab 1080p video from my camera any faster than 5 fps. (Which isn’t really a problem on its own, as things aren’t generally moving that fast, but my camera has rolling shutter, which warped any moving markers on the screen into an unrecognizable state.) The processing done by my software doesn’t really take that much time, so it should be possible to run this sort of system at much higher resolutions.

Why does the hue filter still capture non-blue things (e.g. the purple pencil and textbook)?

My camera doesn’t let me turn off auto white balance and automatic exposure compensation, so these parameters were frequently changing while I was trying to work on my system. Because of this, the hue range (and lightness/saturation ranges) I chose had to work for all sorts of white balance and exposure settings.

Why not just buy or borrow a better camera?

The whole point of this project was to solve a difficult problem - precise localization - using tools I already owned. That means working around the inherent limitations of the hardware. I also wanted to try to make the system as cheap to deploy as possible, so I used my cheap-ish and commonly available webcam.

Why blue painters’ tape for the border?

I tried using red, but my skin kept making it past the filter and screwing up the marker detection every time I reached into the scene. I also had a lot of problems with marker ink on paper being surprisingly glossy: portions of the border appeared white on the camera instead of blue because of reflected light.

Why only 16 possible labels?

At first, I let labels have an arbitrary number of squares. However, I found that the further a square was from the marker border, the less accurate its calculated position (the yellow grid) became, so these multi-line labels had a lot of scanning errors.

Why not use colored squares to determine the marker type? This would let you define many more types of marker.

This works fine for two colors, but trying to distinguish between any more becomes difficult, especially in a variety of lighting and white-balance scenarios. I tried using red-green-blue, but there were too many errors for it to work well.

]]>
https://breq.dev/projects/lps /projects/lps Tue, 01 Sep 2020 00:00:00 GMT
<![CDATA[McStatus.js]]>

Overview

McStatus.js is a JavaScript library and API backend to embed a Minecraft server status readout.

Motivation

I wanted to be able to check who was online on my Minecraft server without having to join. I also wanted a project where I could improve my understanding of JavaScript, the DOM, and web technologies in general.

Technical Details

mcstatus.js loads a div full of Minecraft server information into the DOM wherever it finds a <div class="mc-status">, using the data-mc-server attribute to set the server IP. The status protocol for Minecraft servers uses raw TCP sockets, so querying a server directly from browser JavaScript isn't possible. There are a lot of existing Minecraft server status tools, like mcsrvstat.us, but they don't set a CORS header, so they can't be used from JavaScript. So, I implemented my own at https://mcstatus.breq.dev/ with the bare minimum API needed for this project to work, using Dinnerbone's Server Pinger under the hood.
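
A bare-minimum backend like this can be sketched with Flask: one endpoint that pings the requested server and returns JSON with a CORS header so the browser can call it. (The route name, JSON shape, and ping_server helper below are placeholders, not the real mcstatus.breq.dev API.)

from flask import Flask, jsonify, request

app = Flask(__name__)

def ping_server(address):
    # Placeholder for the raw-TCP status query (e.g. Dinnerbone's Server Pinger).
    raise NotImplementedError

@app.route("/status")
def status():
    info = ping_server(request.args["server"])
    response = jsonify(info)
    response.headers["Access-Control-Allow-Origin"] = "*"   # the CORS header other tools lack
    return response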

Results

It pretty much does everything I wanted it to do, and working on the project definitely gave me a better understanding of how to style websites using CSS. I didn't really end up using the CSS/JS part for anything, but I did use the server component for Breqbot's Minecraft server functionality.

]]>
https://breq.dev/projects/mcstatus /projects/mcstatus Sun, 16 Aug 2020 00:00:00 GMT
<![CDATA[Vibrance]]>

Music in the video is made by my sibling, Max Michael.

Overview

Vibrance is a tool to use a large number of computers or smartphones as a single output to create lighting effects. By setting the color on each screen in unison, colorful and interesting visual effects can be created. Vibrance also handles generating these effects based on some input, such as a Digital Audio Workstation, keyboard input, or custom-made lighting controller.

Motivation

I remember seeing a video of a Coldplay concert in which every audience member had an LED wristband which would light up in sync with the music. It seemed like a cool way to involve the audience in a concert.

I wanted to make a light show like the Coldplay concert that would cost almost nothing. No big screen, no manufacturing hundreds of bracelets, just everyone's personal devices and some controller software.

Technical Description

The Vibrance system has three main parts: the controller, the relay, and the clients. Working backwards from the clients:

Clients

The client code is a JavaScript app that runs on audience members' phones or computers. It receives messages from the relay and changes the color displayed on the user's screen accordingly. These messages are JSON objects, which allows for a variety of extensions to be added. For instance, it is possible to direct a client to display certain text on its screen (song lyrics, a welcome message, etc). Because this code runs in JavaScript, and the messages require very low latency, I chose the WebSocket protocol.

Relay

Most of the time, both the client devices and the controller will be connected to a public WiFi network or a cell network, and thus be hidden behind NAT. So, an intermediate server with a public IP address is required to facilitate the connection. The Relay fills this role.

Because it would be impractical for the controller to calculate separate colors for hundreds of clients, the clients are divided into zones. Client devices will indicate their zone to the relay (typically after prompting the user to choose which part of the room they are in). Messages sent to the relay by the controller are marked with their destination zone.

Upon receiving a message from the controller, the relay forwards it along to each device in its intended zone. In the background, the relay manages newly-connected clients, ensures connections are kept alive, and removes inactive clients.
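
Here's a sketch of that forwarding logic using the websockets library (the join/role handshake and message format are assumptions, not Vibrance's actual protocol; recent versions of websockets accept single-argument handlers):

import asyncio
import json
import websockets

zones = {}                                   # zone number -> set of connected client sockets

async def handle_client(ws, zone):
    zones.setdefault(zone, set()).add(ws)
    try:
        await ws.wait_closed()               # clients only listen
    finally:
        zones[zone].discard(ws)              # drop disconnected clients

async def handle_controller(ws):
    async for raw in ws:
        message = json.loads(raw)            # e.g. {"zone": 2, "color": "FFF"}
        targets = zones.get(message["zone"], set())
        await asyncio.gather(*(c.send(raw) for c in targets), return_exceptions=True)

async def handler(ws):
    hello = json.loads(await ws.recv())      # e.g. {"role": "client", "zone": 2}
    if hello.get("role") == "controller":
        await handle_controller(ws)
    else:
        await handle_client(ws, hello["zone"])

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8080):
        await asyncio.Future()               # run forever

asyncio.run(main())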

Controller

The controller is the most complicated part of the pipeline. It is responsible for handling input from some device (through a driver), determining the color messages to send to the relay (using an interface and script), and sending these messages to the relay.

Driver

The role of the Driver is to read input from some source and return it in a common format. Drivers can use a variety of inputs, such as MIDI ports, computer keyboards, and serial ports (e.g. for use with Arduino).

Virtual MIDI ports in particular are quite useful for reading input from a Digital Audio Workstation such as Ableton Live. By placing MIDI notes at certain points in the song, lighting effects can automatically be triggered.

Users can implement their own Drivers to use different controller devices (joysticks, Wii remotes, devices over the network, etc.)
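
To give a feel for the idea, here's what a MIDI driver could look like if events are reported as (source, name) pairs like the "midi" / "note_on_60" seen in the script example below. (This is a guess at the interface, not Vibrance's actual Driver API; the mido library usage itself is real.)

import mido

def midi_driver(port_name):
    # Yield ("midi", "note_on_<n>") events from a (virtual) MIDI input port.
    with mido.open_input(port_name) as port:
        for message in port:
            if message.type == "note_on" and message.velocity > 0:
                yield ("midi", f"note_on_{message.note}")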

Interface and Scripts

Users supply Scripts (written in Python) that describe how to translate inputs from Drivers into messages to send to the Relay. These scripts use an Interface to manage the drivers and relay connection. The Interface allows Vibrance to recover gracefully from a temporary connection failure, and it allows users to change the connected Driver and Relay at any time.

For example, here is a script that flashes zones 1 and 2, with a one-second delay in between, whenever MIDI note 60 is received:

@api.on("midi", "note_on_60")
def animation(event):
    api.color(1, "FFF")
    api.color((2, 3, 4), "000")
    api.wait(1)
    api.color(2, "FFF")
    api.color((1, 3, 4), "000")

A variety of example scripts are provided by Vibrance.

Block Diagram

Results

I don't think I'll have an opportunity to test Vibrance in the real world any time soon, because of the COVID-19 pandemic. However, I have tested it with multiple devices and with hundreds of simulated clients at my house.

I started this project a few weeks (!) before the stay-at-home orders started, and I had initially planned on trying it at the next opportunity my sibling had to perform their music. As an alternative, in the video I filmed demonstrating Vibrance, I emulated their Ableton Live setup as closely as possible.

]]>
https://breq.dev/projects/vibrance /projects/vibrance Mon, 15 Jun 2020 00:00:00 GMT
<![CDATA[Red Storm Robotics]]>

Here's a video of teams 4393X and 4393Z vs. 344R and 344X. At the time, I was on team 344R. 4393X was made up of the people I worked with back at MSSM.

When I returned to Scarborough, I worked with the engineering teacher to start a VRC team. I helped with searching for and applying for grants, ordering parts, and building out a workspace. I then took on a leadership role in Team 344X where I helped show my teammates how to design solutions with VEX parts, how to build efficiently, how to program the robots, and (above all) how to think critically about a challenge.

In January, the team raised enough money from grants to spin off a second team, 344R. Several other students and I built a second robot (effectively starting halfway through the competition season) and competed under that number, eventually qualifying for the state championship and the world championship (both of which were canceled due to COVID-19).

In summer 2020, I decided to give back to the VRC community by contributing code to OkapiLib, a library used by hundreds of VEX teams for advanced functions such as motion profiling, PID control, and asynchronous movement. Notably, the 2019 World Skills Champions used OkapiLib for path generation in their Programming Skills routine. The library has almost ten thousand downloads on GitHub.

My pull request added support for trigonometry, rounding, exponentiation, and other math functions to OkapiLib's units API. Code that uses this API is checked at compile-time for common units errors, eliminating most mistakes caused when a programmer enters a formula incorrectly. My addition of math functions allows users to write formulas that use more than the basic four functions (add, subtract, multiply, divide) without sacrificing this compile-time units checking.

]]>
https://breq.dev/projects/344 /projects/344 Sat, 13 Jun 2020 00:00:00 GMT
<![CDATA[MSSM Penguins Robotics]]>

Here's the autonomous routine I planned, developed, and tuned for the 2019 Maine State Championship on team 4393S. This autonomous routine got us the highest autonomous points ranking out of all 50 teams at the event.

This was my first experience with a robotics competition. I did most of the building and all of the programming for my team. Our team won two local tournaments, was a finalist at the state championship, won the design award at the state championship, and competed at the world championship. This experience improved my mechanical skills, understanding of the design process, embedded programming skills, and teamwork abilities.

Here are a few of the things I'm most proud of from my work on this team:

  • Automatic aim using the VEX Vision Sensor to recognize flags
  • Buttons to automatically shoot multiple flags in quick succession using state machine logic, including sensors to track the position of balls, automatic reloading logic, and angle presets
  • Movement in an arc for saving time during the autonomous routine
]]>
https://breq.dev/projects/4393 /projects/4393 Sat, 13 Jun 2020 00:00:00 GMT
<![CDATA[Mindjacker]]>

Overview

This was a wrapper for nxt-python I wrote while I was in middle school for projects like the R2D2.

Motivation

I liked the featureset of nxt-python, but I wanted to make it more Pythonic and add some common features (like playing audio) that I often used in robots.

Technical Description

It's a wrapper API, so there's not much to describe. Here's a rough idea of what using it looked like:

import mindjacker

brick = mindjacker.Brick()
# Drive motors B and C at power 100 for 5 rotations, with steering 50 and no brake
brick.move(["b", "c"], 100, rotations=5, steer=50, brake=False)
# Play an audio file through the brick's speaker
brick.playSound("sound.mp3")
# Read the ultrasonic sensor on port 1 and log the measurement to a file
measurement = brick.ultrasonic(1)
print(f"Ultrasonic sensor measures {measurement} inches")
brick.write("data.log", measurement)

Results

This was one of the first APIs I actually designed. It's a pretty flawed design, but it was a learning experience. This was also one of the first times I wrote software to make it easier for me to write more software, and I decided to make it open-source.

]]>
https://breq.dev/projects/mindjacker /projects/mindjacker Sat, 13 Jun 2020 00:00:00 GMT