The day began with the keynote (originally to be delivered by AWS CTO Werner Vogels, but instead delivered by Matt Wood), hosted by Darren Hardman. I was surprised to hear Ocado Technologies suggest that a transition to cloud had been more significant in their recent history than the AI you usually hear about from them. We got a good glimpse at their AI ‘air traffic control system’ in their robotics-run warehouse, and the fine-precision communications involved.
They transitioned from server centres co-located with their robo-"fulfilment centres" to the cloud, and ballsily chose to do so on Xmas eve. To top that off, they claim to have never had an outage since doing so 2 years ago.
We also heard the barnstorming success story of Cazoo (car sales as convenient and reliable as ordering on Amazon), who produced a website in 3 months and launched their product replete with its own logistics network in 6. Their stack had an emphasis on serverless: TypeScript on Lambda, S3, React, Serverless Framework (Infrastructure as Code), GitHub, CircleCI, API Gateway, Terraform, DynamoDB, EventBridge (listed in the case study from the consultants who led its development, Codurance).
I didn’t take much away from the talk by the CTO of Genomics England, whose tech angle was that they used Aurora and Fargate.
Ultimately the keynote descended from technical showcase into a sales pitch, and an eyeroller analogy about how the different capabilities of an org (marketing, operations, …) are like the hills of a map whose contour lines appear over time... Cue an eye-catching but not so meaningful illustration. This was a shame: NVIDIA’s GTC keynote follows a better trajectory of showcase followed by insights into the future of the industry (albeit a much longer one).
A fair few new features have just launched (FURLs! Serverless Inference!) perhaps to coincide with the previous AWS Summit in San Francisco a week ago. It would’ve been a nice touch to have more of these pointed out to cap off the talk instead.
Lastly there was emphasis on using the proper database for the job (e.g. time-series vs. graph vs. key-value store vs. document; Timestream/Neptune/DynamoDB/DocumentDB respectively), but this wasn't news.
Towards continuous resilience
There was a great talk on “continuous resilience” as opposed to risk management (from Veliswa Boya).
It felt like a lot of industry best practices got cited explicitly (i.e. with referenced sources) rather than just nodded at as a vague personal preference.
One point highlighted for me (I don’t think I photographed the slide) was how feature flags exist on a continuum: I think the diagram was similar to this one (excerpt from Distributed Systems Observability).
Distributed Systems Observability by Cindy Sridharan (twitter.com/copyconstruct)
The Amazon Builder's Library — How Amazon builds and operates software
The Resilient Architecture Collection — A list of resiliency-related blog posts
The Chaos Engineering Collection — A list of chaos engineering-related blog posts
Immutable Infrastructure — Reliability, consistency, and confidence through immutability
Towards Operational Excellence — On culture, tools, and processes
The Cloud Architect — Build resilient, scalable, and highly available cloud architectures
Four Concepts for Resilience and the Implications for the Future of Resilience Engineering By David D. Woods, The Ohio State University
This talk by Rebekah Kulidzan introduced two complementary approaches to async: ‘orchestration’ and ‘choreography’, presented with analogies about dancing and buses.
The talk was a call to use Step Functions and EventBridge. Step Functions run a bunch of services in series (the way a bus visits multiple stops) and EventBridge is a handler for events that get passed into Lambda (or whichever other service you interface with). Step Functions do ‘orchestration’ and EventBridge does ‘choreography’ in the industry lingo.
One point that wasn't in the talk, but was in the blog post linked during it, is that you want business-critical processes to be managed end-to-end by one system. What do buses and orchestras have in common? They both have a ‘conductor’... In choreo, every service (or dancer) works independently, “loosely coupled” through shared events (which they listen for, like dancers). The key part on this slide is the use of choreo (i.e. EventBridge) for message passing between the bounded contexts of services.
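To make the distinction concrete for myself, here's a minimal in-process sketch in Python. It doesn't touch the real Step Functions or EventBridge APIs: the `EventBus` class, the three "services" and the event names (`OrderPlaced`, `StockReserved`, `PaymentTaken`) are all made up for illustration.

```python
from collections import defaultdict

# --- Orchestration: a single "conductor" owns the flow and calls each step in order ---

def reserve_stock(order):
    return {**order, "stock_reserved": True}

def take_payment(order):
    return {**order, "paid": True}

def book_delivery(order):
    return {**order, "delivery_booked": True}

def orchestrate(order):
    """Run every step in series, like a bus visiting its stops."""
    for step in (reserve_stock, take_payment, book_delivery):
        order = step(order)
    return order

# --- Choreography: no conductor, services only react to events on a shared bus ---

class EventBus:
    """Hypothetical stand-in for an event bus (not the EventBridge API)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, detail):
        # Deliver the event to every handler subscribed to this event type.
        for handler in list(self._handlers[event_type]):
            handler(detail)

def run_choreography(order):
    bus = EventBus()
    log = []

    # Each service knows only which event it listens for and which it emits.
    def inventory_service(detail):
        log.append("stock reserved")
        bus.publish("StockReserved", detail)

    def payment_service(detail):
        log.append("payment taken")
        bus.publish("PaymentTaken", detail)

    def delivery_service(detail):
        log.append("delivery booked")

    bus.subscribe("OrderPlaced", inventory_service)
    bus.subscribe("StockReserved", payment_service)
    bus.subscribe("PaymentTaken", delivery_service)

    bus.publish("OrderPlaced", order)
    return log
```

In the orchestrated version the whole flow is readable in one place (`orchestrate`), which is the appeal for business-critical processes. In the choreographed version no single function knows the full sequence, so a service can be added or swapped by changing a subscription.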
A Craftsmanship mindset - the secret sauce to Cazoo’s growth
This talk was about the principled approach to software development at aforementioned consultancy Codurance. It was less on practical tips or a day-to-day postmortem of Cazoo's success, and more on the industry terms and ways of thinking. We got a brief history from the Agile Manifesto to the Craftsmanship Manifesto. I would have liked to hear the other talk on the practical aspect of this mindset and its methodologies to apply this in my own work, and where they excelled or showed some friction during Cazoo's scaling up.
The talk's key point was that Cazoo chose to hire them for their reputation in software craftsmanship, and a desire to make the right architectural design decisions on the first try: to choose auto-scaling architecture and the right tools for the job, rather than an MVP that’d get rewritten soon after.
This was the gist of the ‘secret sauce’: an emphasis on quality to stave off growing pains. I was able to find the case study online with the technical details.
TDD, IaC, CI/CD… – IYKYK.
This talk had one of the most memorable points of the day for me: why go to all the effort to craft resilient software? It's reflective of the value of the object the software is responsible for. High value products call for high standards of care. I doubt the product even has to be tangible (e.g. it could equally well be 'reputational risk'). The arrival of a new car has both tangible (a high financial value object) and intangible (associated personal/emotional) value attached.
"It's not just ads…"
In the context of this talk it was the traditional transaction of car sales, not only a significant purchase but often a significant personal moment in life: family outings, post-COVID mobility in general, independence. When something goes wrong, you’re liable to cause a lot of grief. These are the things robust and resilient software/architecture protects.
This resonates with our work at Beatchain, where we handle a musician's livelihood, their art. Safeguarding its handling in code thus matters all the more. It’s easy to start getting jaded by Agile-speak — ‘oh they’re just buzzwords’ — but there are real world consequences of an unprincipled approach to programming.
This is also not a simple rejection of "move fast and break things" culture, as Cazoo's rapid growth attests. This is more a philosophy of moving in a deliberate and principled way. Which is fitting when building an in-house logistics network.
As Andrea Saez put it on Twitter this week:
“It doesn't matter, someone will fix it later” shows a lack of empathy for the customer experience and journey as well as for the people that will have to drop everything to fix it.
DevSecOps with Snyk
This was a sales talk on Snyk (pronounced "sneak", short for "so now you know"), a platform that scores vulnerabilities based on both how severe and how exploitable they are (the latter even incorporates ‘trending on Twitter’ as a signal). The scores let developers prioritise which bugs to patch first, and this is as easy as clicking a button to make a pull request (with integrations to GitHub, GitLab, etc). You don't even need to wait until you commit your vulnerable code: they have an IDE plugin that can warn you while you're writing.
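As a rough illustration of why scoring on two axes changes the patch order, here's a toy sketch in Python. To be clear, this is not Snyk's actual scoring formula (which I don't know): the `Vulnerability` fields, the multiplication and the 1.5x "trending" boost are all assumptions of mine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vulnerability:
    name: str
    severity: float         # 0-10, in the style of a CVSS base score
    exploitability: float   # 0-1, hypothetical estimate of how easy it is to exploit
    trending: bool = False  # hypothetical "people are talking about this" signal

def priority(vuln):
    """Toy priority: severity weighted by exploitability, boosted if trending."""
    score = vuln.severity * vuln.exploitability
    if vuln.trending:
        score *= 1.5  # arbitrary boost, purely illustrative
    return score

def triage(vulns):
    """Return vulnerabilities ordered from most to least urgent."""
    return sorted(vulns, key=priority, reverse=True)

backlog = [
    Vulnerability("critical-but-hard-to-reach", severity=9.8, exploitability=0.1),
    Vulnerability("medium-but-trivial", severity=6.5, exploitability=0.9),
    Vulnerability("medium-and-trending", severity=5.0, exploitability=0.5, trending=True),
]
ordered = [v.name for v in triage(backlog)]
```

With these made-up numbers the "critical" issue lands last in `ordered`, because it is so hard to exploit. Sorting by severity alone would have put it first.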
The code analysis is static from what I saw, and they trained the detection model in some A.I. based way so it's not rigid pattern matching. Excited to try this one.
- Snowflake gave a talk pitching their product, a managed database "warehouse" (for B.I., see lakes vs. warehouses), and mentioned 'Snowpark' where you could run arbitrary Python code on it, and SageMaker notebooks.
- Fun borderline carnival atmosphere (there was one of those claw grabber machines filled with toys for some reason)
- Free tickets! It must have cost a lot to put on an event of this size
- The phone signal at the ExCeL
There were many other talks that sounded good, but you don't have long to decide where to go next after you leave one talk while coordinating with the rest of your group. You could split up and each go to your favourite (more like at a festival), but it'd be no fun if you couldn't find people in time and got left on your own all day. At least festivals have decent reception and your own personal base camp.
This was my first AWS conference, and I suppose they’re like restaurants, bars, or clubs: to “do them well” you need to "get good", to cultivate taste. You also need a sense for your group’s taste, so you collectively gravitate (just like you’ll find it easier to pick a gig or restaurant with someone whose taste you already align with).
The overall tone of the day (maybe my selective hearing) was that serverless is the way things are going. It was useful to dwell on the concept so intensely, and to be honest it made me quite confident in how at ease I am with the stack being discussed. A key distinction in serverless is that demand will often be zero: indeed that is the particular difference which SageMaker Serverless Inference (SMSI) introduces (no minimum throughput). After dwelling on that, functions that "keep Lambdas warm" seem like more of an antipattern. What's more, the idea of large cold-start times (for downloading and loading models) in SMSI only emphasises this. On which note, the CPU-only and 6 GB memory limitation of this new service left me hesitant: it was a shame I didn't get to see a talk on it anyway.
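For anyone who hasn't run into cold starts, here's a minimal sketch of the usual mitigation in a Python function handler: do the expensive load once per container and cache it at module level, so only the first (cold) invocation pays for it. The `_load_model` function is a hypothetical stand-in for downloading and deserialising a real model, and the word-counting "model" is obviously fake.

```python
import time

_MODEL = None   # module-level cache: survives between invocations on a warm container
LOAD_COUNT = 0  # only here so the sketch can show how often the slow path runs

def _load_model():
    """Hypothetical stand-in for downloading and loading a large model."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # pretend this is the slow part of a cold start
    return lambda text: len(text.split())

def get_model():
    """Load the model on first use, then reuse the cached copy."""
    global _MODEL
    if _MODEL is None:
        _MODEL = _load_model()
    return _MODEL

def handler(event, context=None):
    """Entry point in the shape of a Lambda-style handler."""
    model = get_model()
    return {"tokens": model(event["text"])}

first = handler({"text": "serverless is the way"})   # cold: triggers the load
second = handler({"text": "still warm"})             # warm: reuses the cached model
```

This only softens the problem: once the service scales back to zero the cache goes with it, and the next request pays the full load time again.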
Before I began using Lambda I was working with CI/CD in a way that I would equally describe as 'serverless' (on-demand, often with no demand but then spiking). As I began to mis/use these pipelines for website generation, I did get a bit evangelical about the concept, which just made iteration so convenient that it felt like the best way to work. For this, the Summit felt like being among other believers, which left me invigorated.
My top tip for AWS conferencing: get there early, otherwise you'll miss out on going around the booths with your team. I arrived with just enough time for a coffee in the morning, so the others got free swag and I didn't. :'(
My 2nd tip would be to take whichever device you have with the best camera (phone/tablet). I'd forgotten how frustrating it felt to take a photo just as the slide advances, but damn it felt good to be back in tech talks.
Oh and a 3rd tip: do a little extra reading through the events listings beforehand to rule in/out the various technologies. This comes in handy when trying to choose between talks in a hurry, if you already have a sense of how suitable/what level/even what the technology is.
On a personal note: this was also the first time I met one of my colleagues! Hopefully the first of many more outings, next up PyCon? :-)
- To fill in for the missing keynote, I listened to Werner Vogels discuss The Evolution of Serverless on the Serverless Chats podcast from earlier this month.
- HuggingFace have just partnered with AWS as its preferred cloud provider. In HuggingFace's experiments, a million requests with SageMaker Serverless Inference to use the DistilBERT model cost around $22.
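To put that figure in per-request terms, here's a back-of-envelope calculation in Python. The only input from the source is the roughly $22 per million requests; the linear scaling and the 50,000 requests a month are my own assumptions, and real costs will depend on memory size and request duration.

```python
QUOTED_COST_USD = 22.0       # approximate cost quoted for the experiment
QUOTED_REQUESTS = 1_000_000  # number of requests in that experiment

def cost_per_request(total_cost=QUOTED_COST_USD, requests=QUOTED_REQUESTS):
    """Average cost of a single request, assuming cost scales linearly."""
    return total_cost / requests

def estimated_cost(requests):
    """Naive linear estimate for a different request volume."""
    return requests * cost_per_request()

per_request = cost_per_request()      # about 0.000022 USD per request
small_month = estimated_cost(50_000)  # about 1.10 USD for a hypothetical quiet month
```

At a volume like that, paying per request rather than for an always-on endpoint looks attractive, which fits the "demand is often zero" framing above.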