Testing Strategy Development
Structured multi-phase engagement; deliverable is a complete testing strategy for your system.
The Problem
Your model is in production. You have error rates, latency metrics, maybe A/B results — but you don’t know if the system is doing what it’s supposed to do in the contexts that matter. You’re testing for correctness, not safety.
Probabilistic software presents real testing and measurement problems. From data and model drift to fairness, bias, security, toxicity, and raw performance, the list of questions you need to answer keeps growing. How do you prioritize them? How will you manage your resources? What happens when you don’t know how to approach one of these critical areas within your specific system?
The Offering
I bring my experience and knowledge to your team to help build a substantive testing practice. Together, we will identify better questions to target while ensuring you have the data required to answer them. We will think beyond the “what” of testing your system and into the “how”: how can you use every signal to take action and improve it? Much like the hazard analysis, this occurs in multiple phases.
The scoping phase
We will spend the first meetings and exchanges of information getting to know each other and allowing me to understand how complex the system is. Based on this, you will receive an analysis document describing what I think are the most relevant areas of interest for testing. From here we will decide if we want to work together to build out the strategy and tests for evaluating those areas of interest, or if you would like to proceed on your own.
The strategy development phase
If we continue to work together, I will collaborate with your personnel to determine whether the system is designed to provide the right signals for analysis. If not, our first order of business will be to set out a plan to build that functionality into the system. While that’s happening, we will also address what we can reasonably test versus what we wish we could test but can’t, whether due to data access, instrumentation, or because the industry does not yet have the knowledge necessary to test that signal. I will then draft a strategy for testing the system and work with you to finalize it.
Recommendations and debrief
After the strategy has been developed, I will provide you with a list of prioritized recommendations based on my findings. We will discuss how you plan to implement them and whether you would like to put me on retainer to address questions as they arise.
What’s Included
- System Testing Analysis Report
- A testing strategy document that outlines the plan, goals, and requirements for testing the system
- Recommendations for implementing tests and continuous monitoring
- A presentation to your leadership (additional fees for traveling to location)
Pricing
Pricing will be set based on the scope of the project. There is a $20,000 minimum for the full project, plus travel expenses, to ensure I have enough time for all three phases (even for small projects). Larger projects may cost $60,000 or more. System complexity, scale, and how safety-critical the system’s use is will factor heavily into the cost structure. No matter the size of the project, you will receive a thorough analysis of your system and recommendations for navigating tradeoffs between safety and value.