I decided to change the format of this AMAA in one very important way: having guests help answer some questions! In case you did not know, Bluesky has recently hired two DevRels, Alex and Jim! Both were kind enough to take time out of their busy schedules to be guests on this version of ask me anything atproto! So you will see them chiming in along with me to answer these questions.
AG - Alex, Bluesky DevRel
BT - Bailey Townsend, me, independent AT Protocol dev
JR - Jim, Bluesky DevRel
Questions from you!
These are questions I have collected from users (you, the reader) over the past week.
Thoughts on DIDs and adding more DID methods?
AG, BSky DevRel: I think DIDs are low-key one of the coolest aspects of the AT Stack, and reveal a lot of our principal design goals: They're based on a W3C standard (for did:web), but made more usable and portable (as well as enabling neat DNS features) through our use of did:plc; and the PLC itself both showcases our history of thinking hard about user-generated authority and is low-key one of the most widely accepted parts of our "solve for decentralization" stack: no one is particularly psyched to try to host their own (because that doesn't enable much other than redundancy) but everyone accepts that this is possible. I can see us trying to add some more granularity to handle/DID lookup methods in the future; most of the team are heavy users of https://pdsls.dev/, a community project that provides a web UI for individuals' data repositories, and we're doing a lot of thinking around Lexicon interoperability and how to provide more interfaces to Atmosphere social graphs.
BT: I honestly like did:plc. I think that it, along with other design patterns, is a crafty way to do a type of decentralization at scale. It allows a low barrier of entry for anyone to join a social media platform while still giving them a credible exit to another host for their data. When I signed up, I didn't fully understand or know that Bluesky was like that. I joined to make friends. A lot of other users are in a similar boat, and it allows them to have a similar experience on the Atmosphere as they would on other more centralized systems. That does require a lot of trust in the organization that holds the PLC. did:web is also really interesting because it allows users to control that level of the protocol fully. The gotcha there is that most people don't know or care about it until they have already created an identity with a did:plc, and there is no migration path from a did:plc to a did:web. I personally think that any other type of DID would sadly have the same problem. I've been really excited to see the work in the community to create replicas of the PLC, as well as the discussion on a "firehose for the PLC" from Bluesky. I think this will probably be the way forward. Any future version of a DID seems to always come down to having to trust some kind of authority, and we already have a pattern for that with the PLC. Just being able to spread that trust out will be important. I also think helping users understand a bit more about Rotation Keys, and making easier ways for them to create those, along with backups and ways to manage them, will help a ton with did:plc too.
JR: Agree with Alex that DIDs are one of the most interesting low-level components of the protocol and really a big part of what makes the whole thing, especially portability, work. I think it's unfortunate that the flip side of that is DIDs are sometimes where I see people confused about Bluesky or the Atmosphere being a "crypto project", but ultimately I think the AT Protocol is a validation that DIDs have real-world use cases that have nothing to do with blockchains. I could see support for a method like did:key as we continue to explore private messages or other peer-to-peer features that might work without a registry.
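For anyone who has not poked at DIDs directly, here is a minimal sketch of how the two methods discussed above turn into a DID document lookup, and how a PDS is read out of that document. It assumes the public PLC directory at https://plc.directory and only handles plain-hostname did:web values; the function names, the `DidDocument` shape, and the example DIDs are made up for illustration and are not from any SDK.

```typescript
// Minimal shape of the parts of a DID document this sketch reads.
interface DidService {
  id: string;
  type: string;
  serviceEndpoint: string;
}

interface DidDocument {
  id: string;
  service?: DidService[];
}

// Map a DID to the URL where its DID document can be fetched.
// plcHost defaults to the public PLC directory (an assumption of this sketch).
function didDocumentUrl(did: string, plcHost: string = "https://plc.directory"): string {
  if (did.startsWith("did:plc:")) {
    // did:plc documents are served by the PLC directory, keyed by the full DID.
    return `${plcHost}/${did}`;
  }
  if (did.startsWith("did:web:")) {
    const host = did.slice("did:web:".length);
    // This sketch only handles plain hostnames: no paths, no encoded ports.
    if (host.length === 0 || host.includes(":") || host.includes("/") || host.includes("%")) {
      throw new Error(`unsupported did:web form: ${did}`);
    }
    // did:web documents live at a well-known path on the domain itself.
    return `https://${host}/.well-known/did.json`;
  }
  throw new Error(`unsupported DID method: ${did}`);
}

// Pull the PDS endpoint out of a DID document, if one is declared.
function pdsEndpoint(doc: DidDocument): string | undefined {
  const svc = (doc.service ?? []).find(
    (s) =>
      s.type === "AtprotoPersonalDataServer" &&
      (s.id === "#atproto_pds" || s.id === `${doc.id}#atproto_pds`),
  );
  return svc?.serviceEndpoint;
}
```

The portability point falls out of this: moving hosts means the `serviceEndpoint` in the document changes while the DID itself stays the same.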
Anything you've wanted to do with atproto that you haven't yet?
AG: There are a lot of community projects that we're working fairly closely with, and though I would never play favorites (the Atmosphere Community Manager Boris Mann recently said something along the lines of, there are no "first party" or "third party" AT developers, which I strongly agree with), I have been really impressed with the rise of https://leaflet.pub/ lately. I think it proves that long-form and short-form writing on the internet really are not and have never been that far apart from one another, can be enabled by the same tooling, and benefit a lot from being colocated on the same graph. I'd love to see what we or others can do to make this even more evident — I think we could see some RSS-like use cases but without the "metadata-only" design that led to RSS usage fracturing after Google Reader.
BT: Whew. A LOT. I keep a pretty long list. The top 3 currently are a bit more educational or for fun. I've really been into Svelte lately, and it's on my short list to write a Statusphere, but for pokes (yes, like the Facebook ones). It will be in SvelteKit, with confidential OAuth and a small step-by-step intro on how you can use Docker to host it on a cheap VPS. The goal is a demo project with a written blog post for others to take and build on. I really want to build at://chess. Keep it all on protocol, no appview/backend, as a "Hey, this is kind of wonky, but it works and you can just do wild stuff with atproto all client side." Finally, a PDS. Mostly just to learn. jacquard has all the Lego pieces there just waiting for someone to make a PDS. It would be in Rust, pretty 1:1 with Bluesky's, but would have traits for the Account Manager and Actor Store, meaning it should be easy to fork and switch data stores. It would just be rewriting the storage layer. But there is also some internal debate about whether I should just look more into cocoon, since my technical end goal is something with a lower memory footprint, which I think cocoon has.
JR: I'm incredibly excited by all of the work that's percolating beyond just building (better!) social media apps. The work that Tangled is doing reimagining social coding is such a great use case. Alex mentioned Leaflet and I love that they are doing some smart work rethinking social blogging. Blogging is 25 years old and there are still new ideas there! My background is in media so I'm pretty optimistic about how publishers will use AT Protocol; I think there will be some cool use cases by traditional publishers but I'm more confident that entirely new media organizations will start with AT Protocol.
What's the best approach to migrate collections from one NSID to a new one?
what would you do if you decide for your app to write to a `app.foo.entity` collection after already accruing records in a `app.bar.entity` collection? like a big "click me to migrate"? i.e. "turns out someone owns bar{dot}app and i want to use a namespace consistent with the domain i own"
AG: As a humble DevRel I haven't had to do too much of this at scale yet, but if I did I would probably start with the Microcosm tooling. We're also trying to take goat, our internal CLI, to 1.0 so we can deprecate pdsadmin.sh, and figuring out what needs to be made easier in there is an interest of mine.
BT: It's an interesting thing. I think even past just the base NSID, there is also the pattern of having an "alpha" lexicon while in development or early stages. You may have noticed that teal.fm's play lexicon is fm.teal.ALPHA.feed.play. It's something I haven't had to do yet, but if I had to, I think it would be something I would do on the fly for users. I would support reads on both collections in the UI for a while, and as users log in and interact with the atproto app I would either migrate their records in the background or, failing that, offer a button. smokesignal.events actually did one semi-recently as well. A bit about it here. It was a button in the settings and was communicated via the UI, from what I recall.
JR: Agree we've got some work to do on the tooling here to make these kinds of features more seamless. I'll just add this is why it's important to get your namespacing right!
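To make the "migrate on the fly" idea a bit more concrete, here is a rough sketch of the planning step only, assuming you have already listed the user's old records (for example with com.atproto.repo.listRecords). It produces bodies shaped like the inputs to com.atproto.repo.putRecord and com.atproto.repo.deleteRecord. The function name, the types, and the choice to keep the same record key are illustrative assumptions, not an official migration recipe.

```typescript
// A record as returned by a listRecords-style call: an at:// URI plus its value.
interface RepoRecord {
  uri: string;
  value: { $type?: string; [key: string]: unknown };
}

// Shaped like the inputs to com.atproto.repo.putRecord / deleteRecord.
interface PutRecordInput {
  repo: string;
  collection: string;
  rkey: string;
  record: { $type: string; [key: string]: unknown };
}

interface DeleteRecordInput {
  repo: string;
  collection: string;
  rkey: string;
}

interface MigrationPlan {
  writes: PutRecordInput[];
  deletes: DeleteRecordInput[];
}

// Split "at://<did>/<collection>/<rkey>" into its three parts.
function parseAtUri(uri: string): [string, string, string] {
  const parts = uri.startsWith("at://") ? uri.slice("at://".length).split("/") : [];
  if (parts.length !== 3) throw new Error(`not a record URI: ${uri}`);
  return parts as [string, string, string];
}

// Plan copying every record from oldNsid to newNsid, keeping the same rkey,
// and (separately) deleting the old copy once the write has succeeded.
function planMigration(oldNsid: string, newNsid: string, records: RepoRecord[]): MigrationPlan {
  const plan: MigrationPlan = { writes: [], deletes: [] };
  for (const rec of records) {
    const [repo, collection, rkey] = parseAtUri(rec.uri);
    if (collection !== oldNsid) continue; // ignore anything outside the old collection
    plan.writes.push({
      repo,
      collection: newNsid,
      rkey,
      record: { ...rec.value, $type: newNsid }, // the record's $type must match its new collection
    });
    plan.deletes.push({ repo, collection: oldNsid, rkey });
  }
  return plan;
}
```

Keeping the writes and deletes as separate lists lets an app do the copy first, keep reading from both collections for a while, and only run the deletes once it is happy with the result.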
Questions from me!
Since both Alex and Jim were kind enough to accept my invite to be guests on this AMAA, I thought I would throw in some questions of my own to ask just them.
Why is constellation so interesting and helpful to developers?
AG: I just shouted out Microcosm up above, but I can get a bit more in-depth here. I often wind up trying to communicate the assumptions made in our stack through infrastructure diagrams or other write-ups, and one of the principles that comes out of that is that you really don't need Relays or App Views to build out the set of interactions made possible through Bluesky or other Atmosphere apps; all you really need is a Lexicon and a PDS. However, you do need those other bits to do this stuff at scale, for "big world" social. Constellation is a great, efficient, minimal set of patterns for "how do we resolve interesting queries at scale using this data," without needing to actually run them through the Bluesky App View.
JR: It's pretty easy to get overwhelmed by the size of Bluesky and everything being published across the broader Atmosphere. We've become so accustomed to using terms like "firehose" that it can be easy to forget where it comes from (the debt this world owes to Weird Al will never truly be known). Microcosm and Constellation are making it easy to get started, "to just build things" as we say. It kind of reminds me of the first time I used jQuery (complimentary!). And I really appreciate the approach they are taking around building community infrastructure; it's just a very smart, thoughtful project through and through.
What is a rough outline if someone wanted to back fill an application's AppView of a lexicon? Like xyz.statusphere.status.
AG: So, backfilling is really interesting, especially if (like me) you'd only worked with firehose-type APIs for microblogging in the past. The whole concept of backfilling is tied to our core assumption that if it's not possible to replicate the whole network of posts in (for example) the Bluesky Lexicon, we have not credibly succeeded in our decentralization goals. Generally speaking, our idea is that, if you want to mirror the whole network (not just to run your own social app, but maybe to do large-scale data analysis that doesn't need to be enabled by us), first you backfill the 30TB or so that's already on there, then you cut over to streaming new data. Backfilling itself is pretty straightforward — get a big Postgres, cook at 350 degrees until golden brown — but cutting over from backfilling to streaming new data has been tricky until fairly recently, and we're excited to (soon) collapse this into a single "sync" primitive that should resolve a lot of the ops layer questions in there.
JR: Backfilling is a great example of the full breadth of the platform. Most apps, most developers, will never need to worry about backfilling and that's great! Some, especially devs interested in building out infrastructure, absolutely will and I'm so glad the support is there, even if it takes a while. Figuring out where your app exists on the "app — infrastructure" spectrum is a pretty important early step and something we can absolutely build more support for.
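Since the tricky part mentioned above is the cutover from backfilling to streaming, here is a small in-memory sketch of one way to reason about it: start buffering live events first, load the backfilled snapshots, then replay the buffer in order before applying new events directly. The `StatusIndex` class, the event shape, and the record fields (assumed to look like a xyz.statusphere.status record with `status` and `createdAt`) are illustrative assumptions; a real AppView would use a database and a persisted cursor.

```typescript
interface StatusRecord {
  status: string;
  createdAt: string;
}

// A simplified live event: one create/update/delete of one record.
interface LiveEvent {
  did: string;
  rkey: string;
  operation: "create" | "update" | "delete";
  record?: StatusRecord;
}

class StatusIndex {
  private rows = new Map<string, StatusRecord>();
  private buffer: LiveEvent[] = [];
  private buffering = true; // start in "backfill in progress" mode

  // Called for every event from the live stream, including during backfill.
  onLiveEvent(ev: LiveEvent): void {
    if (this.buffering) this.buffer.push(ev);
    else this.apply(ev);
  }

  // Called once per repo with the records fetched during backfill.
  loadBackfill(did: string, records: Array<{ rkey: string; value: StatusRecord }>): void {
    for (const r of records) this.rows.set(`${did}/${r.rkey}`, r.value);
  }

  // Replay everything that arrived during backfill, in order, then go live.
  cutOver(): void {
    for (const ev of this.buffer) this.apply(ev);
    this.buffer = [];
    this.buffering = false;
  }

  get(did: string, rkey: string): StatusRecord | undefined {
    return this.rows.get(`${did}/${rkey}`);
  }

  private apply(ev: LiveEvent): void {
    const key = `${ev.did}/${ev.rkey}`;
    if (ev.operation === "delete") this.rows.delete(key);
    else if (ev.record) this.rows.set(key, ev.record);
  }
}
```

The ordering matters: because buffering starts before any snapshot is taken, replaying the buffer on top of the snapshots ends with the newest state for each record, even when an event and a snapshot describe the same change.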
What AT Protocol application have you been the most excited about lately?
AG: Leaflet, for the reasons I mentioned above. And https://anisota.net/, because it's such a cool way of reusing existing Lexicons. And https://graze.social/ for their recent work with WNYC.
JR: Strong concur on all of those from me. I really like what Graze is doing making it possible for anyone to build custom feeds; I think that's another powerful abstraction on top of what can be an overwhelming feature. Anisota is such a clever approach to building a client that really rethinks the primitives of the feeds we've been kind of stuck with for two decades now. Sill is another favorite of mine from Tyler Fisher that aggregates the links from your feed to create a kind of personalized newspaper. Monomarks is one I've started using a bit, recreating del.icio.us or Pinboard for the era of distributed social. I'm excited for teal.fm to launch! I'll also add how excited I am to see so much focus on building on the web and resisting the "appification" of the internet that we've seen over the last 15 years or so.
What has been your favorite language (or SDK) you've used to work with AT Protocol?
AG: Personally, Go, because I've always been a backend developer, and I like how profoundly inexpressive Go is as a language — it makes collaboration really easy so you can focus on other interesting problems. That said, I'm really excited about the new code generation work we're doing to provide really clean primitives for custom Lexicons, and I think TypeScript is the lead SDK there, and I just submitted a talk to PyCon US which I'm particularly excited about, which is all about our use of LLMs to enable moderation at scale with Python.
JR: I'm a Python guy so I'm thankful there's a community SDK, which I'm looking forward to digging into a bit more.
What's your favorite lexicon?
AG: Like I mentioned above, I think Anisota has approached Lexicon design from a really interesting perspective, by asking "OK, what can we reuse from existing Lexicons that are likely to be populated for most users, versus what do we need to add to our own Lexicon to make this app work?"
I've been wanting to make this reference for a while, and it's silly, so please bear with me. If you ever played the original Metal Gear Solid on PlayStation 1, that game did a lot of fourth wall breaking, and one totally novel part was during a fight against a boss called "Psycho Mantis": he would "read your personality" by telling you what other video games you'd played, based on the other save data on your PlayStation's memory card. This was really fun because he had fully voice acted lines to say things like "I see you also like Castlevania!" and innovative because games weren't supposed to read other games' memory card data for any reason; for one thing, they wouldn't have had the schema for parsing it beyond basic title metadata. Anisota is showing exactly what's cool about the Atmosphere: any app can use another app's schema to see not just whether they also like Castlevania, but what their Castlevania posts are like :)
JR: This is a good reminder that we can do more to highlight existing lexicons as a way to learn some best practices! I don't know that I have a favorite lexicon per se but I really miss Last.fm and the whole idea of music scrobbling in general so I'm gonna go with teal.fm's play lexicon.
What’s the difference between Jetstream and the Firehose?
AG: Honestly, our blog post explains it better than I could here. The Firehose (also called that by other social media platforms) is a pretty well-understood concept; that's where you can listen for alllll the new network events, be they posts or whatever else. It has a well-defined spec, but the ergonomics of working with the Firehose directly are not that great for many developers because of the amount of data involved. Jetstream is a [frequently] more usable Firehose.
JR: The important thing to know is you almost certainly want to start with Jetstream and if, for some reason, you really do need more low level access, then drop down to the Firehose. And if you do find yourself in a situation Jetstream can't handle, file an issue on the Jetstream project.
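As a small illustration of why Jetstream is the friendlier starting point: it speaks plain JSON over a WebSocket and lets you filter by collection in the URL. The sketch below builds a subscribe URL and picks the interesting fields out of one commit event. The default host is one of the public Jetstream instances, and the helper names and the trimmed-down `CommitEvent` type are assumptions made for this example rather than part of any SDK.

```typescript
interface JetstreamOptions {
  host?: string;
  wantedCollections?: string[];
  wantedDids?: string[];
  cursor?: number; // microsecond timestamp to resume from
}

// The subset of a Jetstream commit event this sketch cares about.
interface CommitEvent {
  did: string;
  timeUs: number;
  operation: string;
  collection: string;
  rkey: string;
  record?: unknown;
}

// Build the WebSocket URL for a filtered Jetstream subscription.
function jetstreamUrl(opts: JetstreamOptions = {}): string {
  const host = opts.host ?? "jetstream2.us-east.bsky.network";
  const params: string[] = [];
  for (const c of opts.wantedCollections ?? []) {
    params.push(`wantedCollections=${encodeURIComponent(c)}`);
  }
  for (const d of opts.wantedDids ?? []) {
    params.push(`wantedDids=${encodeURIComponent(d)}`);
  }
  if (opts.cursor !== undefined) params.push(`cursor=${opts.cursor}`);
  const query = params.length > 0 ? `?${params.join("&")}` : "";
  return `wss://${host}/subscribe${query}`;
}

// Parse one raw message; returns null for non-commit events (identity, account).
function parseCommit(raw: string): CommitEvent | null {
  const ev = JSON.parse(raw);
  if (ev.kind !== "commit" || !ev.commit) return null;
  return {
    did: ev.did,
    timeUs: ev.time_us,
    operation: ev.commit.operation,
    collection: ev.commit.collection,
    rkey: ev.commit.rkey,
    record: ev.commit.record, // absent on deletes
  };
}
```

In a browser or a recent Node.js, something like `new WebSocket(jetstreamUrl({ wantedCollections: ["xyz.statusphere.status"] }))` with `parseCommit` in the message handler would be enough to start watching a single lexicon.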
I want to thank both Alex and Jim for taking time to answer these! I'm excited to have them both as a part of the Atmosphere and excited about the future!