Project Management
Programming for Data Science
What is Project Management?
Broadly speaking, project management (PM) is a class of rational methods and protocols for organizing the labor of complex activities in order to achieve specific outcomes that are on time and on budget.
PM is important for data science programming because programming means building software, and software projects often involve teams building complicated products for clients.
PM is an established field that is used for many activities, from military operations to business processes.
It is a profession in which you can be certified, for example as a Project Management Professional (PMP).
It is a field of study with a large, established body of knowledge known as the PMBOK (Project Management Body of Knowledge).
The history of PM dates to the early 20th century when scientific techniques were applied to management and manufacturing.
One of the earliest tools of project management is the Gantt chart, which visualizes how a project's many labor and production tasks are scheduled over time.
PM became highly developed in the post-war era of the 1950s and ’60s.
Think of the complexity of the Manhattan Project or NASA’s moon shot.
From the beginning, PM has been used to support critical operations: weapons systems, air traffic control, financial systems, health care, etc.
Increasingly, software became involved in all of these operations.
But in the 1960s, the US Department of Defense’s study of software problems found that:
\(47\%\) of software delivered could not be used, as it usually didn’t meet requirements.
\(29\%\) of funded software was never delivered, as it was usually canceled due to cost/schedule overruns.
\(19\%\) of software was useful after extensive rework, and it usually cost \(36\) times more to fix problems after release.
Eventually, PM was adapted to software development, to address these issues.
Software development is hard for many reasons:
Hardware and software are inherently complex.
Code is imperfect and error-prone.
Code is always an interpretation of requirements that are always underspecified.
Requirements are underspecified in two senses:
First, there is the inherent inadequacy of language to represent reality.
“The map is not the territory.”
Second is the fact that people don’t know what they want even when they say they do.
That is, reality itself changes.
The slipperiness of writing code that achieves our goals is captured in the Danish proverb:
The operation was a success, but the patient died.
Paradigms of Software Development
To address these issues, many paradigms of the software development life cycle (SDLC) have been introduced over the years.
Two popular approaches are the Waterfall and Spiral methods.
The Waterfall Model
The Waterfall model represents software development as a linear process that begins with requirements gathering and formalization and ends with maintaining software products.
The metaphor is that everything descends from the requirements.
The Iron Triangle
One of the limitations of the waterfall model is that it relies heavily on getting the requirements right in the first place.
The Iron Triangle concept shows the effect of changing user requirements on a project.
If users want to add something to a project, such as a new feature, then the scope has increased.
Therefore, either time or resources must expand as well to allow for the change.
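The trade-off can be made concrete with a little arithmetic. The sketch below assumes a deliberately simple model that is not from the text: scope is measured in person-days, resources are the number of people, and time is scope divided by people. The function names and all numbers are hypothetical, and real projects rarely scale this cleanly.

```python
def duration_days(scope_person_days: float, team_size: float) -> float:
    """Time needed under the simplifying assumption time = scope / resources."""
    return scope_person_days / team_size


def team_size_needed(scope_person_days: float, deadline_days: float) -> float:
    """Resources needed to finish a given scope by a fixed deadline."""
    return scope_person_days / deadline_days


# Hypothetical baseline: 120 person-days of work, 4 people.
baseline_scope = 120
team = 4
print(duration_days(baseline_scope, team))   # 30.0

# The client adds a feature estimated at 20 person-days.
new_scope = baseline_scope + 20

# Option 1: keep the team fixed, so the schedule grows.
print(duration_days(new_scope, team))        # 35.0

# Option 2: keep the 30-day deadline fixed, so the team must grow.
print(team_size_needed(new_scope, 30))       # about 4.67 people
```

Holding both the schedule and the team size fixed leaves no way to absorb the extra 20 person-days, which is the constraint the triangle expresses.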
The Spiral Model
The Spiral method addresses the problem of changing requirements by being more iterative.
It brings the client in at recurring points in an ongoing cycle.
Both methods have the virtue of being rational and comprehensive, defining all the things involved in software development.
But they are very linear.
Even the Spiral method is a linear process. The same linear sequence —
Objective Identification
-> Alternate Evaluation
-> Product Development
-> Next Phase Planning
— is repeated successively until the project is completed.
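The repetition described above can be sketched as a nested loop: an outer loop over cycles and an inner loop over the same four phases in the same order. The function name `spiral` and the fixed cycle count are assumptions for illustration; in practice the outer loop would run until the project is judged complete.

```python
from typing import Iterator, Tuple

# The four phases named in the text, in their fixed order.
PHASES = (
    "Objective Identification",
    "Alternate Evaluation",
    "Product Development",
    "Next Phase Planning",
)


def spiral(cycles: int) -> Iterator[Tuple[int, str]]:
    """Yield (cycle number, phase) pairs for a given number of cycles."""
    for cycle in range(1, cycles + 1):
        for phase in PHASES:  # the same linear sequence on every pass
            yield cycle, phase


for cycle, phase in spiral(2):
    print(f"cycle {cycle}: {phase}")
```

Each pass through the outer loop is one turn of the spiral; the inner loop never changes its order, which is the sense in which the method remains linear.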
Engineering and Human Labor
What these methods have in common is a faith in the power of planning and engineering to achieve results.
Planning assumes that we can predict accurately how things will unfold and that people understand and follow rules.
However, life is not always linear.
Many factors interfere to alter the movement from requirements to product.
Client and Developer
The relationship between initial requirements and final product is embodied in the relationship between client and developer.
But notice the absence of the client in the design, building, and validation phases in both models.
This absence is partly an attempt to limit the scope creep introduced by the client, which runs afoul of the Iron Triangle.
But effective software design often requires a more involved role for the client.