Staff Engineer Path

Book Review: The Staff Engineer’s Path by Tanya Reilly. Tanya Reilly gives a guide for individual contributor software engineers who wish to grow their careers but do not want to become managers. It gives insight into what a staff engineer does and what you need to do to perform at that level. This is a technology-agnostic book: it gives the reader a high-level view of the functional areas that matter. ...

<span title='2023-07-24 09:05:00 -0700 -0700'>July 24, 2023</span>&nbsp;·&nbsp;Jose Villalta

Coroutines for Go

Russ Cox put out an article yesterday about adding the ability to run coroutines in Go. Today I learned the difference between a goroutine and a coroutine. A coroutine is a concurrency pattern in which only one routine runs at a time: say we have coroutines A and B; B waits while A runs, then A yields to B, and A waits while B runs. It turns out this is useful in a few scenarios. ...
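A minimal sketch of the pattern, built from plain unbuffered channels rather than the API proposed in the article (the names and structure below are my own, for illustration): each routine only makes progress while the other is blocked.

```go
// A toy coroutine pair built from unbuffered channels (NOT the API proposed
// in Russ Cox's article, just an illustration of the pattern): A yields a
// value to B and blocks; B runs, yields back, and blocks; and so on.
package main

import "fmt"

func main() {
	toB := make(chan int) // A -> B
	toA := make(chan int) // B -> A

	// Coroutine B: waits for A to yield, does its work, then yields back.
	go func() {
		for v := range toB {
			fmt.Println("B running with", v)
			toA <- v + 1 // yield control back to A and wait
		}
		close(toA)
	}()

	// Coroutine A: does its work, yields to B, and waits for B to yield back.
	v := 0
	for i := 0; i < 3; i++ {
		fmt.Println("A running with", v)
		toB <- v  // yield control to B and wait
		v = <-toA // resume when B yields back
	}
	close(toB)
}
```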

<span title='2023-07-18 06:51:45 -0700 -0700'>July 18, 2023</span>&nbsp;·&nbsp;Jose Villalta

Learning File Systems

Lately I have been learning about file systems from the book “Operating Systems: Three Easy Pieces” by Remzi Arpaci-Dusseau. I used to think that I knew how file systems worked because the interface, open, read, and write, is so straightforward; what else could there be to it? But then weird issues come up at work, like du saying the disk has space while df says the disk is full. What could make that happen? Or an issue mounting a volume comes up and you realize that you don’t know the difference between mounting a block device and mounting a file system. Are they the same thing? I need to have a mental model of what the system is doing in order to debug it: knowing the data structures in the file system and having an idea of what happens when you open a file. How does the operating system find the file? How does it traverse the file tree? What is in memory versus on disk? I must confess I am still in the dark when it comes to container images and union file systems. I understand that a container is made up of many layers superimposed on top of each other, and that an image is essentially a tar file of different file systems stacked on top of each other. But how is it implemented? Can you just put a whole other /proc and other system files on top of a kernel? So anyway, that’s what I have been up to. ...

<span title='2023-07-15 21:21:31 -0700 -0700'>July 15, 2023</span>

My Tsundoku Pile

I’m trying to get through all the technical books that I’ve acquired and never gotten around to. Not going to lie, the Knuth books are intimidating. They are actually not that bad to get through, but they are books that I pick up, read a few pages on a specific project, try to do a problem or two, and that’s it. The other books are less intimidating and more doable. I’m pretty sure all but one of these books were lying around the Amazon campus, just sitting on shelves. ...

<span title='2023-01-22 20:02:32 -0800 -0800'>January 22, 2023</span>&nbsp;·&nbsp;Jose Villalta

Paper every day. Day Ten: The UNIX Time-Sharing System by Dennis Ritchie and Ken Thompson

Link to Paper. This paper, published in July 1974, is remarkable because the design decisions made back then by these guys working at Bell Labs on an operating system for the PDP-11 are still relevant. I am still struggling to build a mental model of the Unix file system; the fact that it looks like a single tree with the root at the top, while simultaneously you can have multiple devices mounted, dates back to these guys at Bell Labs. ...

<span title='2022-07-31 10:40:30 -0700 -0700'>July 31, 2022</span>&nbsp;·&nbsp;Jose Villalta

Paper every day: Day Nine: An Analysis of Linux Scalability to Many Cores

Link to Paper. From the abstract: “This paper analyzes the scalability of seven system applications running on Linux on a 48-core computer… using mostly standard parallel programming techniques - this paper introduces one new technique, sloppy counters… these bottlenecks can be removed from the kernel or avoided by changing the application slightly.” This paper has an excellent system-level tutorial on scalability. They explain that you don’t get a linear increase in performance because, in real-life applications, parallel tasks usually interact, and an interaction forces serial execution. They then list the common causes along with common solutions. This paper is thoroughly written and researched. Writing truly parallel code is difficult, and even then applications still compete for some shared resource, be it a local cache, network access, or disk I/O. ...
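The sloppy-counter idea is roughly: keep a per-core count and only fold it into the shared global counter occasionally, so cores rarely contend on the same cache line. The paper does this inside the kernel in C; the Go sketch below is my own illustration of the same idea, with made-up names and an arbitrary flush threshold.

```go
// A minimal Go sketch of the idea behind sloppy counters: each worker keeps a
// local count and only updates the shared counter occasionally, trading a
// little read precision for far less contention. Names and the threshold are
// illustrative, not from the paper.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

const flushThreshold = 1024 // fold local counts into the global counter this often

type sloppyCounter struct {
	global atomic.Int64
}

// worker simulates one core incrementing the counter n times.
func (c *sloppyCounter) worker(n int, wg *sync.WaitGroup) {
	defer wg.Done()
	local := int64(0)
	for i := 0; i < n; i++ {
		local++
		if local == flushThreshold {
			c.global.Add(local) // the only cross-core synchronization
			local = 0
		}
	}
	c.global.Add(local) // flush the remainder
}

func main() {
	var c sloppyCounter
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go c.worker(100_000, &wg)
	}
	wg.Wait()
	fmt.Println("total:", c.global.Load()) // 800000
}
```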

<span title='2022-07-25 09:13:41 -0700 -0700'>July 25, 2022</span>&nbsp;·&nbsp;Jose Villalta

Let’s Go

I have been writing Go since a little after last year. I actually remember hearing about Go when it first came out; back then I honestly never thought I’d be getting paid to work in it. Even though I’ve been writing code in Go for a while, I don’t think I know the language in enough depth to consider myself a Go expert. I want to change that, so I am going to start writing about Go here as a way to “learn in public”. Expect posts on the following topics: ...

<span title='2022-07-24 21:42:07 -0700 -0700'>July 24, 2022</span>&nbsp;·&nbsp;Jose Villalta

Paper every day. Day Eight: Omega: flexible, scalable schedulers for large compute clusters

Link to Paper. Omega was the second cluster management system built by Google. It is Borg’s successor, and it was designed as a happy medium between Borg’s centralized scheduler architecture and Mesos’s two-level approach, where placement is delegated to the running framework. Omega shares the state of the cluster among schedulers and uses optimistic concurrency control to detect when different cluster schedulers are competing for the same resource. The premise of the whole paper is that a centralized scheduler does not scale well, so there must be a better way to handle scheduling different types of workloads in a fast and correct manner. The two main types of workloads, services and batch jobs, have different requirements and present their own unique challenges. The paper explains the simulations the engineers at Google used to determine that conflicts among different schedulers are not that common, and that Omega manages to fit more tasks into the clusters than Mesos. ...
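A toy sketch of the optimistic-concurrency idea (not Omega’s actual implementation, and all names below are invented for illustration): each scheduler reads the shared cell state, decides on a placement, and tries to commit it; if another scheduler committed first, the commit fails and the scheduler re-plans.

```go
// Two schedulers race to place a job on the same machine. A commit succeeds
// only if the shared state has not changed since the scheduler read it; a
// failed commit is a detected conflict. Purely illustrative.
package main

import (
	"fmt"
	"sync"
)

type cellState struct {
	mu      sync.Mutex
	version int64             // bumped on every successful commit
	owner   map[string]string // machine -> job that claimed it
}

func (s *cellState) snapshotVersion() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.version
}

// tryCommit succeeds only if nothing changed since seenVersion and the
// machine is still free.
func (s *cellState) tryCommit(seenVersion int64, machine, job string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.version != seenVersion || s.owner[machine] != "" {
		return false // conflict: another scheduler committed first
	}
	s.owner[machine] = job
	s.version++
	return true
}

func main() {
	state := &cellState{owner: map[string]string{}}

	var wg sync.WaitGroup
	for _, job := range []string{"batch-job", "service-job"} {
		wg.Add(1)
		go func(job string) {
			defer wg.Done()
			v := state.snapshotVersion() // read the shared cell state
			if state.tryCommit(v, "machine-1", job) {
				fmt.Println(job, "placed on machine-1")
			} else {
				// a real scheduler would re-plan and retry on another machine
				fmt.Println(job, "hit a conflict, would retry elsewhere")
			}
		}(job)
	}
	wg.Wait()
}
```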

<span title='2022-07-24 21:30:27 -0700 -0700'>July 24, 2022</span>&nbsp;·&nbsp;Jose Villalta

Paper every day. Day Seven: Large-scale cluster management at Google with Borg

Link to Paper. Borg is the cluster management system that runs hundreds of thousands of jobs at Google. It is the original system; its successor, Omega, was written as a reaction to the lessons learned from it, and Kubernetes is the third system written with the lessons from those two. This paper helped me understand a few things about my own system, since we have our own cluster management and scheduler system that works a little differently but in general does the same job. ...

<span title='2022-07-23 18:56:09 -0700 -0700'>July 23, 2022</span>&nbsp;·&nbsp;Jose Villalta

Paper every day. Day Six: Hints for Computer System Design

Link to Paper. This paper was originally published in 1983 by the legendary folks at the Xerox Palo Alto Research Center. The hints and tips should sound familiar, but it’s interesting to notice the layer the author is talking about: these guys were designing at a very low level. The fact that the same rules apply now is remarkable. It turns out that breaking up a system into the right abstractions with good interfaces is rather hard. ...

<span title='2022-07-22 08:13:55 -0700 -0700'>July 22, 2022</span>&nbsp;·&nbsp;Jose Villalta