Thursday, July 2, 2015

Software Engineering: Becoming a Team Lead

Here are some thoughts on team leads and leadership. First-time leads have asked me questions about the transition and what is important. As a longtime lead and now manager, here is some of the advice I give them.


A team lead's fundamental role is to maintain and improve the performance of the team as a whole.  As an individual contributor you want to maximize the return on investment of your own time.  As a team lead you want to maximize the return on investment of everyone's time on the team.  This can often come at the cost of not being able to contribute as much individually.  But that is okay, because your responsibility is not just yourself, it's the team as a whole.  It can take some time to adjust to this kind of work.  You have to retrain your ability to feel productive so that it is based on the performance of the team and not your direct contributions.

The Bigger Picture

A skill that should already be developing as a Senior Engineer is the ability to shift away from just the implementation and design, to understanding the requirements and their origin.  Understanding the business drivers allows a good lead to offer alternatives and options that may solve the underlying business problem faster or cheaper, or sometimes both.  This can only be done by understanding who the customer is, what problem is being addressed for that customer, and what the business value is.  Understanding the bigger picture leads into areas engineers typically dislike, such as fuzzy requirements, partial or incomplete information, and sometimes having to drive consensus and requirements.


Communication is as much about how to communicate as it is what to communicate.  The what may change from job to job, but there should always be analysis, summarization, and the presentation of options with their costs and benefits.  There are questions regarding priority and overall strategy, and simply making sure that your manager is well informed about what is going on.

Intra-team communication is just as important as up-down communication.  This means making sure everyone on the team is on the same page, and fostering and encouraging collaborative discussions as well as quick daily status standups. It also includes the team's strategy for communicating and interacting with other teams: making sure that regardless of who on your team someone works with, they get the same good customer experience and the same level of communication.

Identifying When to Add New Process or Procedures

Leadership is about widening your lens, taking in all the individual things, the good and the bad, and looking for recurring patterns.  Things that cost time or resources, that impact value, and that recur are good candidates for new processes or procedures.  Raising these recurring patterns to management, with options for how to solve them or mitigate their cost, is key to being a good lead.


Leaders at every level are given a certain amount of autonomy when it comes to making decisions.  One of the most challenging aspects of becoming a lead is being comfortable making those decisions, as well as understanding where to draw the line.  Mutual agreement with your manager on the boundaries of your autonomy is necessary, and can be reached on an as-you-go basis since communicating the decisions you make is still required.  The boundaries of autonomy are also not necessarily static; they can and should evolve over time as experience and managerial rapport increase. The transition is often from asking permission before doing anything, to informing and potentially being overridden. The latter is always preferred as long as the lead and the manager are on the same page as far as priorities and goals.

Sunday, April 26, 2015

Adaptive Value Driven Development

The SDLC process my development group at work follows is based in Agile, but is not Scrum, XP, Lean, or any other named implementation of Agile.  Instead, we have created our own implementation that focuses on adaptation over time.  Over time we have added, evaluated, and kept the practices that added value to arrive at the current incarnation of our process.

The process works within a particular context.  The group is small, less than 5 senior engineers. No one works remotely.  We are an internal tools team whose focus is on resiliency, reliability and rapid release and deployment.

Example Service Summary:
  • 100% uptime (0 downtime) over the past 22 months.
  • 80 releases (~90 features) to production over those same 22 months.
  • 800,000 requests served over the past 3 months.
  • 0 dropped requests due to deployments or unavailable up-stream systems.
This is the context in which this process was applied.

The Core

The core of our process is focused on adaptation and evolution.  We iterate on our process in the same way we iterate on our code, to evolve it in response to the requirements placed on it.  We look to refine, add, or remove practices using iteration.  The central question we ask of every practice we use is "Does it provide more value than its cost?".  That leads us into the definition of value, but first, a short digression on process.

Any process, including an SDLC, is there to facilitate work.  No practice or process has value in and of itself.  It only has value if it facilitates work. Value in an SDLC is measured by a practice's ability to contribute to or facilitate the creation and delivery of maintainable, extensible software in a timely manner.  Any practice whose cost exceeds its value should be changed (to reduce cost or increase value) or removed.  This includes any so-called "best practices".  Try out practices, measure their success, and when you have data, make a decision.  Just because a practice is on someone's list of best practices doesn't make it so.

The evaluation of practices is not a one-time calculation.  All practices should continue to be challenged, not necessarily in every iteration, but whenever the cost/value proposition may have changed.  The SDLC should continue to evolve and change to suit the needs of the team and the software they develop.

The Tenets

Deliver customer value (new features and fixes) as soon as possible, without service interruptions. Provide the ability to potentially deploy after every fix, without a maintenance window.

All engineers are required to be active participants in shepherding the ongoing evolution of the SDLC.  They must collaborate, express opinions, and engage in technical dialectics all in an effort to create the best software they can.

Accountability and responsibility are placed on individuals, never the group.  This is true for projects, project work, and questioning or proposing changes to the process.

Be pragmatic; there will always be exceptions to the rule.  Follow the process as much as possible, but realize there will be exceptions.  These exceptions should be explicit (raised to the team lead or manager) and then dealt with in the most pragmatic way possible.  Exceptions should be exceptional, not the norm; they should not happen often.

All things being equal, the fewer lines of code, the better.  Always favor implementations that yield smaller code sizes but are still extensible and maintainable.  This is achieved by using any and all programming techniques to reduce code size, including metaprogramming and mixed-paradigm programming (object oriented, functional, aspect).  This leads to fewer defects, less code to manage, and less time spent updating code.  Do not confuse code maintainability/readability with technique familiarity or coding to a lesser skill level.

Collaboration is not encouraged, it is required.  The team should be a team, not just a collection of individuals sitting close together.  That means regular communication throughout the day.  This means being open to this kind of communication, and learning how to accept and manage interruption.

Never implicitly accrue technical debt.  The completion of every piece of work involves rigorous refactoring.  In exceptional cases you may have to accrue technical debt, but do so with knowledge aforethought. Pay off accrued technical debt as soon as possible.

Some of Our Current Practices

Keep in mind all these practices can have exceptions, but excepting from them should be explicit and should involve the team lead or manager.  Since they are exceptions, they shouldn't happen often.

Group Ownership. Everyone works on everything.  Engineers should not regularly pick up cards for the same areas of code.  Ideally, they should pick cards that represent areas they are least familiar with.  It is the responsibility of the engineer to get help from those that have more experience in unfamiliar areas as needed in order to do the work in a reasonable time frame.

Everyone gets an opportunity to be a project lead. As new projects come in, engineers are picked to be the project lead.  They are accountable and responsible for the success of the project.  This means they must engage the stakeholders to understand what the software should do, get answers to all the open questions, break down the work into cards, and do all the due diligence required for the success of the project.

All work tracks back to customer value. Projects are broken down into cards.  All cards must represent customer facing features (value), or defects.  They must also describe the acceptance criteria based on customer value.

Automate the validation of acceptance criteria. We practice ATDD, so the acceptance criteria become the acceptance test, which is co-developed with the implementation.  A feature is done when there is a working system with that particular feature implemented and the automated acceptance test developed and passing.  As an aside, we don't actually use Selenium or any other typical acceptance test framework.  In a normal acceptance test the server would actually start up and the test would then execute as black box calls against the system.  We found there is more value in retaining the control provided by the unit test framework, especially when needing to validate particular exception/error conditions.  So we simulate a network call by making a function call to the top of the stack.  The test then exercises all other ancillary systems such as databases, file systems, SOAP or REST APIs, etc.  This allows us to leverage mocks or dynamic patching to exercise error conditions or more complex scenarios.
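
As a rough sketch of what this can look like (the module, function, and status names here are hypothetical, not our actual code), the acceptance test calls the top of the stack directly and uses patching only to force error conditions:

import unittest
from mock import patch  # unittest.mock in Python 3

from myservice import api  # hypothetical top-of-stack entry point

class AcceptanceCreateWidget(unittest.TestCase):

    def test_create_widget(self):
        # Simulated "network call": a plain function call to the top of the stack.
        response = api.handle_request('POST', '/widgets', {'name': 'gear'})
        self.assertEqual(response.status, 201)

    def test_create_widget_when_database_is_down(self):
        # Dynamic patching exercises an error path that is hard to trigger
        # through a real black box network call.
        with patch('myservice.db.connect', side_effect=IOError):
            response = api.handle_request('POST', '/widgets', {'name': 'gear'})
            self.assertEqual(response.status, 503)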

Leave clean implementations with a minimum of code. Refactoring is required with every card.  The goal is to have the smallest code size that is still extensible and maintainable. This doesn't mean we pack as much on a single line as possible.  What it does mean is that we use any and all techniques to reduce code size.  Engineers are expected to learn and grow their skills; this includes meta-programming, functional programming, object-oriented programming, and aspect-oriented programming, as well as any idioms and techniques specific to the language.  We never code "down" or restrict the use of particular language abilities.  Closures, function wrapping, dynamic class and method manipulation, dispatch programming, DSLs, and in some cases even dynamic patching: these techniques and others are all tools to be leveraged as long as their use is appropriate.
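
As one small illustration (a made-up example, not from our codebase), dispatch programming can collapse an if/elif chain into a table, so adding a new operation becomes a one-line change:

def validate(payload):
    return bool(payload)

def archive(payload):
    return {'archived': payload}

# Dispatch table instead of an if/elif chain over the action name.
ACTIONS = {
    'validate': validate,
    'archive': archive,
}

def handle(action, payload):
    return ACTIONS[action](payload)

handle('validate', {'id': 1})  # => True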

Smaller work items, tighter cycles, less risk. Cards are created to be completed in about 1 to 1.5 days as much as possible.  This means the project lead does the job of breaking features down into granular pieces that can be completed in that time frame AND provide value to the customer.  Cards that take too long can be noted and managed explicitly.  The lead or manager can decide to accept the longer time frame, or punt on the card and move on earlier with less lost investment.  The team can also switch to a higher priority much more easily: either by finishing the card (1-2 days), or simply losing the investment (1-2 days).  Less risk, more flexibility.  The cost is more work by the lead in breaking down the cards properly; however, this also generally leads to a more well-defined solution.

You Ain't Gonna Need It Yet (YAGNI).  Write only the minimal implementation required to meet the acceptance criteria, which means a working system.  This doesn't mean ignoring proper design and code layout and putting everything in one function or object.  It means only writing code that is used in meeting the acceptance criteria: new code for new acceptance criteria, as well as refactors for acceptance criteria that already have implementations.

Track development cadence, understand the sources of variance. Our iterations are similar to Sprints, except they are dissociated from projects.  The reason we dissociate from project timelines is that we constantly have new projects coming in or finishing.  Planning is always happening, implementation is always happening; all aspects of our SDLC are always happening.  We do not work in phases.  Our iteration is the work week, and we calculate our velocity, the total work completed for the week.  We then keep a running 12-week mean and standard deviation.  When planning, we can use the mean and one standard deviation to produce a relatively accurate time to completion for a project, assuming we have already broken down all the work.  The standard deviation reflects variance in what the team is spending their time on.  For example, this analysis may reveal the team is getting frequent interruptions in the form of questions that may be better addressed to the manager.
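
Roughly, the planning math looks like this (the velocity numbers below are made up for illustration):

from math import sqrt

# Last 12 weekly velocities (completed cards per week), hypothetical numbers.
velocities = [14, 11, 13, 9, 12, 15, 10, 13, 12, 11, 14, 12]

mean = sum(velocities) / float(len(velocities))
stddev = sqrt(sum((v - mean) ** 2 for v in velocities) / len(velocities))

# Use mean minus one standard deviation as a conservative weekly rate.
remaining_cards = 60
weeks_to_completion = remaining_cards / (mean - stddev)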

Make releases cheap and easy.  We do not use feature branches.  All work and commits are done to HEAD.  Every successful build is a release candidate.  The build should always be successful; build breaks are treated as top priority to fix.  Our builds tag the repository, automatically add any completed tickets to our ChangeLog, run through our automated acceptance test suite, and package the result, and the package is made available from our software repository for installation to any QA server.  The version of the package is the same as the tag in the repository, so we retain an audit trail that allows us to track back from installed product to repository tag at any time.

System and software design for 100% uptime.  All software should be deployable without a maintenance window.  Only in exceptional cases should we need to take an outage, and generally it should not be a full outage.  This means we design from the start for high availability (Active/Active), because it's relatively easy and there is generally no reason not to.


We have been very successful as a small development group.  We write internal REST services and web applications, but the services do get a significant amount of traffic and are required to be a cut above when it comes to reliability and stability.  The success we have had in the amount, quality and cadence of releases can be attributed to our process.


Tuesday, April 14, 2015

Python Idiom: Collection Pipeline

A common implementation involves calling a set of functions sequentially, with the result of the previous call being passed to the subsequent call.

from math import sqrt, ceil
def transform(value):
   x = float(value)
   x = int(ceil(x))
   x = pow(x, 2)
   x = sqrt(x)
   return x

This is less than ideal because it's verbose and the explicit variable assignment seems unnecessary.  However, the inline representation may be a little tough to read, especially if you have longer names, or different fixed arguments.

from math import sqrt, ceil
def transform(value):
   return sqrt(pow(int(ceil(float(value))), 2))

The other limitation is that the sequence of commands is hard coded.  I have to create a function for each variant I may have.  However, I may have a need for the ability to compose the sequence dynamically.

One alternative is to use a functional idiom to compose all the functions together into a new function.  This new function represents the pipeline that the previous set of functions ran the value through.  The benefit is that we extract the functions into their own data structure (in this case a tuple), where each element represents a step in the pipeline.  You can also build up the sequence dynamically should that be a need.

Here we use foldl (aka reduce) and some lambdas to create the pipeline from the sequence of functions.

from math import sqrt, ceil
# reduce is a builtin in Python 2; in Python 3 it lives in functools.

fn_sequence = (float, ceil, int, lambda x: pow(x, 2), sqrt)
transform = reduce(lambda a, b: lambda x: b(a(x)), fn_sequence)
transform('2.1')  # => 3.0

Now I have a convenience function that represents the pipeline of functions.  We can extrapolate this type of pipeline solution for more complex and/or more dynamic pipelines, limited only by the sequence of commands.  The unfortunate cost of this idiom is the additional n-1 function calls created by the reduce when composing the sequence of functions together.  Given this cost, and the cost of function calls in Python, it would probably be better to use this in cases where there will be additional reuse of intermediate or final forms of the composition.
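
If that overhead matters, one alternative sketch is to keep the pipeline as data and apply it with a single loop, avoiding the nested closures:

from math import sqrt, ceil

def pipeline(functions):
    def run(value):
        # Apply each step in order; a single helper call instead of
        # n-1 nested lambda calls.
        for fn in functions:
            value = fn(value)
        return value
    return run

transform = pipeline((float, ceil, int, lambda x: pow(x, 2), sqrt))
transform('2.1')  # => 3.0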


Friday, March 6, 2015

Python: unittest setUp and tearDown with a ContextManager

Python unittest follows the jUnit structure, but is extremely awkward.  One of the more awkward parts is the use of setUp and tearDown methods.  Python has an elegant way of handling setup and teardown: it's called a ContextManager.  So let's add it.

import unittest
from functools import wraps
from contextlib import contextmanager

def addContextHandler(fn, ctx):
    @wraps(fn)
    def helper(self, *a, **kw):
        if not hasattr(self, ctx):
            return fn(self, *a, **kw)

        with getattr(self, ctx)():
            return fn(self, *a, **kw)

    return helper

unittest.TestCase.run = addContextHandler(unittest.TestCase.run, 'contextTest')

class TestOne(unittest.TestCase):

    @contextmanager
    def contextTest(self):
        print "starting context"
        yield
        print "ending context"

    def testA(self):
        print "testA"

    def testB(self):
        print "testB"

    def testC(self):
        print "testC"

if __name__ == "__main__":
    unittest.main()

Or, if you want to play nice with unittest.TestCase and not modify it directly, you can subclass it.

import unittest

class MyTestCase(unittest.TestCase):
    ctx = 'contextTest'

    def run(self, *a, **kw):
        if not hasattr(self, self.ctx):
            return super(MyTestCase, self).run(*a, **kw)

        with getattr(self, self.ctx)():
            return super(MyTestCase, self).run(*a, **kw)
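
Usage of the subclass is then the same as before; for example (a minimal sketch):

from contextlib import contextmanager

class TestTwo(MyTestCase):

    @contextmanager
    def contextTest(self):
        print "starting context"
        yield
        print "ending context"

    def testA(self):
        print "testA"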

Thursday, September 4, 2014

Python Idiom: First Occurrence

Finding the first occurrence in a collection of data is a common problem. 

# Non Idiomatic
found_line = None
for line in logfile:
   if regex.match(line):
      found_line = line
      break
return found_line

Compared to

# Idiomatic
return next((line for line in logfile if regex.match(line)), None)


# Idiomatic (thanks to Suresh V)
from itertools import dropwhile
return next(dropwhile(lambda x: not regex.match(x), logfile), None)

The idiomatic solution is not only more compact, it also reads better.  The generator expression also gives the interpreter the opportunity to be more efficient in how it allocates memory.


Saturday, August 30, 2014

Singletons Reconsidered


Don't make it a global, use it only for stateful resources, and don't use one if you can't implement it properly due to language or ability. Add management controls to the interface so that you can control the behavior of the Singleton in cases like testing, debugging or resetting.


 Everyone by now knows the arguments.


The typical complaint is that Singletons are global, and that makes them hard to test and hard to use in tests.  In most languages we can address those issues directly.

  1. Don't make the Singleton global, make it scoped to the Singleton class or module.
  2. Support management controls like a reset or clear method.
There is no reason to make a Singleton global. You should be able to import the class that will return the Singleton. Ideally you make the Singleton truly instantiate with the first constructor call. Any other constructor call would just be returning the already constructed object.  For all usages it becomes just another constructor call that happens to return the same object.

The Singleton should persist state, which does make it harder to test. However, if you add management controls then the Singleton poses no testing problems.  With a reset or destroy method, the Singleton class is completely testable.

Hidden Dependencies

If it's no longer a global, that means you have an explicit import or include.  Its inclusion is no longer assumed, and as a result you know whether a given module uses the Singleton because it has the import.  The dependencies are no longer hidden; they are explicit and clear.

Violates the Single Responsibility Principle

No it doesn't.  At its core, SRP refers to cohesion and coupling.  Two things that aren't cohesive should not be coupled together, because changes in one should not impact the other.  However, if they are in the same class you have coupled them together, so when either responsibility changes the entire class has to change as well.  This is tight coupling.

This has nothing to do with an object being a Singleton, unless it is somehow exporting its ability to be a Singleton (like a metaclass, mixin or template class might).  Being a Singleton is a property of the class; that doesn't mean the behavior is primary, i.e. the intent of the class is not to provide Singleton behavior out to other objects. Since the Singleton behavior is encapsulated and not exposed, SRP remains intact.

Doesn't Work Right in Language X

Yeah, well that's self-explanatory.  Don't use language X, or if you have to use language X then don't use Singletons.


Thread safety, now that is a real argument.  Yes, Singletons can suck in a threaded application unless the Singleton has semaphores or mutexes to create the appropriate critical sections.  Yes, it's hard to get right, and you may not know you didn't get it right until that weird bug happens in production. HOWEVER, that is an ongoing risk of threaded programming regardless of Singleton usage.  Singletons might make it a little more likely you screw it up, but it's not going to be in some novel way.

This risk is also completely mitigated in the case of a read only Singleton, such as a Config object.

Singletons Done Right IMO

Okay, so I'm not a hotshot programmer.  I consider myself a decent bordering on good programmer.  With all those caveats upfront,  here is how I do Singletons.

Override Instantiation to return the same instance always, or the same instance given the constructor arguments as a unique key.

Make the actual instantiation of the Singleton lazy. So it just does the right thing regardless of actually creating the object the first time underneath the covers, or simply returning the same object that already exists.

Always provide an explicit reset or destroy for the Singleton to facilitate testing.
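
A minimal sketch of the simple case (a single instance, ignoring the constructor-arguments-as-key variant and any thread safety) might look like:

class Config(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        # Lazy instantiation: construct on the first call, return the same
        # object on every call after that.  (Note __init__ still runs on
        # every construction call, so keep it idempotent.)
        if cls._instance is None:
            cls._instance = super(Config, cls).__new__(cls)
        return cls._instance

    @classmethod
    def reset(cls):
        # Management control to facilitate testing.
        cls._instance = None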

Sunday, April 20, 2014

Creating a local email archive with: offlineimap and procmail

I synchronize my imap folders to maildir on my local laptop often, so I have access to my email without a network and can use my preferred search and email clients.  To facilitate how I use email, I keep a local archive, which is created and filtered by procmail.

Here is an approximation of my crontab (cron doesn't start a shell, so I put most of the commands in a script):

% crontab -l
0-59/5 9-18 * * * $HOME/bin/syncemail 

Here is the syncemail script:


offlineimap 2>&1 | logger -t offlineimap

for i in `find $MAIL/Disney -type f -newer $PROCMAILD/log`; do
  cat "$i" | procmail
done

and here are the relevant portions of my .procmailrc:

ARCHIVEBY=`date +%Y-%m`
MKARCHIVE=`test -d ${ARCHIVE} || mkdir -p ${ARCHIVE}`

# Prevent duplicates
:0Wh: $PMDIR/msgid.lock
| /usr/bin/formail -D 100000 $PMDIR/msgid.cache



Sunday, March 23, 2014

REST: POST vs PUT for Resource Creation

Questions often come up about whether to use PUT or POST for creating resources in REST APIs.

I've found both are appropriate in different situations.


PUT is best used when the client is providing the resource id.
PUT https://.../v1/resource/<id>
Per spec, PUT is for storing the enclosed entity "under the supplied Request-URI".  This makes it the ideal HTTP method for creating or "storing" a resource.  Only when all the requirements for PUT can't be met should POST be considered.  The perfect example is when the client cannot provide the resource id.


POST is best used when the client doesn't know the resource id a priori.
POST https://.../v1/resource
POST shouldn't be the first choice for resource creation because it's really more of a catchall method.
"The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI."
It doesn't require anything be created, or made available for later.
"A successful POST does not require that the entity be created as a resource on the origin server or made accessible for future reference. That is, the action performed by the POST method might not result in a resource that can be identified by a URI."

Wednesday, February 12, 2014

Python: Aggregating Multiple Context Managers

If you make use of context managers you'll eventually run into a situation where you're nesting a number of them in a single with statement.  It can be somewhat unwieldy from a readability point of view to put everything on one line:

with contextmanager1, contextmanager2, contextmanager3, contextmanager4:

and while you can break it up on multiple lines:

with contextmanager1, \
           contextmanager2, \
           contextmanager3, \
           contextmanager4:

sometimes that still isn't very readable.  This is more of a problem if you're using the same set of context managers in a number of places.  Ideally you should be able to put the context managers in a variable and use that with however many with statements need them:

handlers = (contextmanager1, contextmanager2, contextmanager3, contextmanager4)
with handlers:

Of course this doesn't work because handlers is a tuple, not a context manager, and with will throw an exception.  What you can do is create a context manager that aggregates other context managers:

from contextlib import contextmanager
import sys

@contextmanager
def aggregate(handlers):
    for handler in handlers:
        handler.__enter__()

    err = None
    exc_info = (None, None, None)
    try:
        yield
    except Exception as err:
        exc_info = sys.exc_info()

    # exc_info gets passed to each subsequent handler.__exit__
    # unless one of them suppresses the exception by returning True
    for handler in reversed(handlers):
        if handler.__exit__(*exc_info):
            err = None
            exc_info = (None, None, None)

    if err:
        raise err

So now you can aggregate all the context managers into one and use that one in the with statement:

handlers = (contextmanager1, contextmanager2, contextmanager3, contextmanager4)
with aggregate(handlers):

You can build up the list of context managers however you want and use aggregate when using them in a with statement.


Friday, January 17, 2014

Python Metaprogramming: A Brief Decorator Explanation

A brief explanation on how to think about Python decorators.  Given the following decorator definition:

def decorator(fn):
    def replacement(*a, **kw):
        # replacement decides whether and how to call the original fn
        return fn(*a, **kw)
    return replacement

This usage of the decorator

@decorator
def fn():
    pass

is functionally equivalent to

def fn():
    pass

fn = decorator(fn)

Note that fn is not being executed.  Instead decorator is being passed the callable object fn, and is in turn returning a callable object replacement which is then bound to the name fn.  Whether or not the original callable ever gets called is up to decorator and the replacement callable.

Another thing to consider, which often causes people problems, is the timing of the decorator's execution, which is to say during the loading of the module.  If you want to execute a particular piece of logic during fn's call, then that logic needs to be placed in the replacement callable, not in the decorator.
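
A quick way to see this is to put a print in both places; the first print runs when the module is imported, the second only when fn() is actually called:

def decorator(fn):
    print "decorating", fn.__name__      # runs at module load time
    def replacement(*a, **kw):
        print "calling", fn.__name__     # runs each time fn() is called
        return fn(*a, **kw)
    return replacement

@decorator
def fn():
    pass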

So now that everything is clear it's obvious that

@decorator
@make_decorator(args)
def fn():
    pass

Is really just

def fn():
    pass

fn = decorator(make_decorator(args)(fn))

Which means the first decorator in a stack is the last to be applied.