This was my exit document, crafted at the end of 2023 when I was preparing to retire from Google as an L7 IC after working there for 16 years. My main focus at Google was data and analytics.
Looking back at the last 16 years is an intimidating task. Google and I have changed a lot. There is a lot I could say, but what I feel is most important is what I have learned and how I have grown. They are things I may forget in detail after a long time away from the software world, but that will always stay with me.
I have a tendency to latch onto pithy phrases: inventing, stealing, repeating, condensing them. I find them broadly applicable and quite memorable. This is a collection of pithy phrases I have found helpful, along with some longer explanations of the nuance behind them. Maybe they will be useful to you, and maybe you will find some of them funny or apt enough to remember.
Like one? Have another one to share? Feel free to comment!
On Engineering
Accountability is strength. There is no better source of self-improvement than continually holding oneself accountable. Accountability is not blame or fault. I don’t even think of it as external. It’s something one is and something one holds oneself to. It’s about continual self-retrospection: understanding what actually happened, and the cause and effect of the actions taken and the ones that could have been taken. Once I’ve done that, I tell others and see if it aligns with their perspective. They often have interesting insights, usually aligned, but also different. And I have somehow always found that people believe me more intelligent and competent for continually and vocally admitting failures and successes.
You can never count the living. The scene is this: a fantasy general, mourning the dead after a large battle, is worried about the cost of their decisions. An advisor offers the wisdom, ‘It is easy to count the dead, but you can never count the living’. Paths diverge, choices are made, and you can never tell what would have happened had you chosen the other path. Statistics can help, but they are harder to calculate and further removed from emotional significance. These types of tradeoffs are hard and constant. Do you fix the reliability bug or launch? You are in a Schrödinger’s experiment: launch with the bug and you get either an OMG or nothing; delay to fix it and you get decreased adoption or maybe missed customers. OMGs are costly, both in external perception and in internal opportunity cost, but so are lost customers. Being slow is a cost. And it goes a level deeper: maybe it’s not a full OMG, perhaps the bug just causes some customers to leave, decreasing adoption; the initial ‘wow’ moment is gone and they never try again. Maybe one of those customers was a big client, and losing them cost a huge contract. It’s like the butterfly effect, but in real time. Attempting to fully account for this can lead to a decent amount of decision paralysis, and many different approaches are fully justifiable depending on the decision maker’s internal beliefs. So what do you do? Well, for starters, it’s valuable to realize that often the most important factors in a decision cannot be measured, so getting into a ‘measure off’ for certain decisions is rarely productive. Do not over-rely on metrics. It’s also important to realize that expertise provides value, especially when it cannot be quantified. The way I try to tackle it is to think longer term: what would happen if this decision were made in the same direction 1000 times? Iterative decision making causes probabilities to smooth out, and tradeoffs become clearer.
That doesn’t mean you always follow the result; oftentimes a mixed strategy is optimal. But it gives clarity and emotional resonance on the tradeoffs in a way that thinking about a single decision often lacks.
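As a rough illustration, the ‘1000 times’ thought experiment can be sketched as a tiny simulation. Everything here is an assumption made up for the example: the incident probability, the costs (in arbitrary units) and the function names do not come from any real launch. It only shows how a single decision looks like all-or-nothing while the repeated decision settles toward an average.

```python
import random

# All numbers are invented for illustration only.
P_INCIDENT = 0.05      # assumed chance the unfixed bug causes a major incident
COST_INCIDENT = 100.0  # assumed cost of such an incident (arbitrary units)
COST_DELAY = 8.0       # assumed cost of delaying the launch to fix the bug


def launch_now(rng: random.Random) -> float:
    """Cost of one 'launch without fixing' decision: usually 0, occasionally large."""
    return COST_INCIDENT if rng.random() < P_INCIDENT else 0.0


def fix_first(rng: random.Random) -> float:
    """Cost of one 'fix, then launch' decision: a known, fixed delay cost."""
    return COST_DELAY


def average_cost(policy, trials: int, seed: int = 0) -> float:
    """Average cost of making the same decision `trials` times."""
    rng = random.Random(seed)
    return sum(policy(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for trials in (1, 10, 1000):
        print(trials, average_cost(launch_now, trials), average_cost(fix_first, trials))
```

With one trial, launching looks either free or catastrophic. Over many trials its average cost drifts toward probability times cost, which can then be compared against the fixed cost of delay. The parts that cannot be measured still have to be weighed by judgment.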
Deliberately counter entropy. Fight against the tragedy of the commons and gradual codebase decay. Take extra time to make others’ lives easier. This is especially important if you are working with a remote team. Any remote team collaboration takes a lot of effort to be successful, so be a part of the solution. Give extra grace, spend the extra time, file the bugs, make the hotlists, and help in ways that are clearly valuable but that people usually don’t want to do because they are a bit annoying. Match the style of the codebase you are in. Make things better. Be a person people want to work with again; it will do wonders for your happiness and career. Don’t spend all of your time on this, but definitely spend some, and it’s hard to spend too much.
Do the same thing… but better. This is a common but ineffective long term strategy for improvement. It usually comes about when the process depends on a human’s repetitive attentiveness: things like ‘pay more attention during CR’ or ‘next time we won’t take on so much technical debt’. Sometimes these can shift behavior for a bit, but humans are not robots. Making meaningful changes requires updating policy or process. Optimally this is some form of automation. It is definitely possible for an issue to not be worth the effort to fix, and this type of ‘solution’ can sometimes be used as a proxy for that, but it’s better to be explicit when these tradeoffs are happening (even though it does take longer to justify). (AKA hope is not a strategy)
No problem that wasn’t solved by two emails is solved by the third. Set up a meeting and talk to each other. No really. Do this! Yes, even if the timing is inconvenient. Everything is easier with social context and visual reminders that the other person is a person.
It is far more important to ask the right questions than to know the right answers. Knowing the right answers solves one problem. Knowing how to ask the right questions can solve all problems. There is an art to knowing what you do not know, and it is critical to higher level software engineering (and probably other things too).
User error is a product bug. Users shouldn’t be confused by our products.
Don’t become a “quality” team. Names matter. Naming yourself after a very common and large area is a bad idea. Both external and internal teams will be confused, you will get assigned more bugs than applicable, and you don’t want to absolve anyone of responsibility for the quality of the product. Be specific about what is in your mission.
On Leadership
You don’t solve greed by giving it more. This is a cautionary saying and can apply in a lot of ways. In software I mostly recall it in scenarios related to burnout. Engineers push themselves, and can be asked to push themselves, beyond the point of sustainability. This behavior is rewarded heavily. In the short term and for important things this is expected and fully normalized. Engineering practices follow an ebb-and-flow stress model (which isn’t to say it’s required, just that they do). The key issue arises when ‘exceptional circumstance’ becomes business as usual. Unsustainable motivational tactics are easy to exploit as there are always opportunities being missed. This exploitation can be a deliberate strategy and is often a local maximum depending on the leader’s prioritization (product vs people) and time horizon. It’s only after a couple of years that the cost of attrition, burnout and narrowed collaboration is felt, and who’s to say the engineers who left weren’t worth the launch? It can be in the leader’s interest, the company’s interest and the shareholders’ interest, but it is rarely in the employees’ interest… in the long term (in the short term it is heavily rewarded, and different constitutions are affected differently). Red flags are phrases like ‘existential crisis’, ‘tightening belts’, ‘family culture’, ‘you don’t want the team to fail’ and behaviors like ‘pushes at the beginning of a product life cycle’, ‘deadlines defined externally’, etc. As an individual, be aware that it can be a feature, not a bug. Advocate for yourself, and when you evaluate whether it’s worth working at a pace you know is unsustainable long term, remember: you don’t solve greed by giving it more. (related: you don’t solve a lack of prioritization with more headcount; vote with your feet; why heroes are bad)
It’s easy not to see if you don’t look. There is a class of leaders who use ‘I didn’t know’ as a defense, whereas I have always felt it is an admission of guilt. It’s not possible to be perfect and no one expects any leader to be, but not knowing about a large problem inside your org generally means being new, not listening or not being trusted, all of which need to be addressed independent of the underlying issue being raised. Generally when this has happened in a high-profile situation, the evidence is everywhere and it takes deliberate blindness to avoid it. As a leader, I feel it is extremely important to continually take temperature checks of all the roles involved, foster a culture where it is easy to object, and make sure any issues are caught early. Preventing issues is a lot easier than fixing them, so always be looking.
You don’t get out of a hole by digging slower. Improving something does not mean fixing it. Making something better does not mean it is good. Improving is still critical, but it is very important to re-evaluate the end state independent of the previous state. Is the latency/reliability/WLB actually good or just better? Humans are very good at accidentally evaluating things only in delta. It’s especially important not to fall into the trap of thinking that because things have improved, there is now slack to give.
Always give people space to surprise you. Delegating is hard and in some cases can take longer/be worse than just doing the thing. This makes delegating feel incorrect when thinking about it from the perspective of ‘accomplish a task’. But delegating isn’t that. Delegating, especially to junior team members, is an investment in the future. It’s much more analogous to tutoring than to accomplishing a specific goal. Sure, the goal needs to happen, but that’s secondary. It works best when you understand the area sufficiently to teach it, then give as little direction as possible and check in frequently. People often rise to the challenges given to them. It always feels a bit odd to “know” how to do something and then not tell the person, but they will learn more and they may come up with something you didn’t even think of. Hence, always give them the space to surprise you.
Thank me with action. This is my standard response whenever I am thanked for bringing up some issue or concern to the person who has the ability to address it. As a corollary, I try to emphasize what actions would help a situation and follow through.
Be careful: sometimes leaders actually mean what they say. Yes, even if you and many people around you are in agreement that they couldn’t mean that.
9 women + 1 month ≠ baby. The Mythical Man-Month, mostly: adding more people does not make a task finish proportionally faster.
On Data
Fail to prove yourself wrong. Data is very good at conforming to confirmation bias. Proving oneself right is often trivial even with bad hypotheses. Once there is enough corroborating evidence, it is natural to stop and declare success. Instead, take an adversarial approach. You aren’t trying to prove the hypothesis correct, you are trying to prove it incorrect. People are a lot more thorough when attempting to prove something wrong. Rather than waiting for some amount of corroborating evidence, it requires seeking out all avenues of contradictory evidence. Once no such avenue remains, the hypothesis is ready to share. You will know you are getting good at this technique when others mention ‘have you looked at X, Y and Z yet’, and you realize that, more often than not, you already have (and then you get better at keeping screenshots and naming scripts).
Data doesn’t lie. There is a tendency to dismiss abnormalities in data that do not make logical sense. This is very problematic, especially long term, since it generally means one of three things: 1. There is a data bug which should be fixed. 2. You don’t understand what is happening, which often means the ‘fix’ will not work because the root cause is wrong. 3. You don’t understand the data, which can have long term implications for the health of the system and your own hypotheses. It’s important to fully understand all data incongruities.
Data bugs are real bugs. It’s important to treat data bugs with as much priority and almost as much urgency as user visible bugs. It can be very tempting to delay these, especially under crunch time, but this type of justification never really goes away and it can become easier and easier to push them off.
We do not fix. I have been part of several teams focusing on analytics, debugging, support, etc. There can be a somewhat overwhelming external pressure to stop working on that type of thing and instead work on features or fixing product bugs. The value of these roles is most apparent when they are missing, and teams that prioritize via crisis tend to have short memories. It’s a bit like the push back that is needed to keep up with technical debt, but for 100% of your and your team’s job. Fortunately, I am mostly impervious to this type of pressure and I have been in teams that have allowed me to operate when I push back. But it does get tiring to always need to advocate for the value of my work. To help me handle that internally, I’ve adopted the moniker “We Do Not Fix” as a tongue in cheek reference to House Greyjoy’s “We Do Not Sow”. Whenever pressure comes I find comfort in repeating it, and I have been tempted to make a pennant in a similar vein but never got around to it.
The map is not the territory. This one is quite famous, but it comes up all the time as one of the perils of over-optimizing metrics.