Penetration testing is a professional service, but for customers looking to engage a pen testing provider it sits in an awkward category of purchase. Unlike most professional services, where you can assess capability through a proposal, a reference call, provider visibility in the marketplace, or a few hours of scoped work, the quality of a pen test is genuinely difficult to evaluate before you buy it. And to some degree, it remains difficult to evaluate even after you receive the report.
This is the core problem buyers face. You are commissioning work whose depth and rigour you cannot directly observe, from providers whose internal methodologies you have little visibility into, to assess risks you may not fully understand yet. Most buyers, understandably, end up falling back on the signals that are easy to read: price and delivery timing. These are measurable, but they are also almost entirely uninformative as proxies for quality.
This guide is an attempt to give you better signals.
What Actually Separates Good Providers
Methodology
The most consequential question you can ask a prospective provider is what methodology they will be testing against. There is a significant difference between a provider who tests against a defined, internationally recognised standard and one who tests against their own internal checklist, who only ever tests to a fixed budget, or whose methodology begins and ends with the OWASP Top 10.
For web application testing, the OWASP Application Security Verification Standard (ASVS) is the benchmark worth asking about and aiming for, while MASVS is its counterpart for mobile applications. OSSTMM and PTES are other references that indicate a provider is working from a documented, repeatable framework or methodology rather than a bespoke approach that is difficult to audit or compare. A provider who can point to the specific standard they are using, and explain why it is appropriate for your target, is telling you something important about how they operate.
The OWASP Top 10 is a useful awareness document, but it is not a testing standard. It describes the most common vulnerability categories at a point in time, and covering it tells you relatively little about the depth or completeness of a given engagement. Top 10-focused testing is not inherently bad, but providers who present it as a methodology are often offering something closer to a structured scan than a genuine manual, audit-style penetration test.
Testing Team
A quality provider doesn’t staff an engagement randomly or by availability. The testers assigned to your work should have been chosen deliberately, with their specific experience, skills, and certifications matching the nature of what’s being tested. A provider who can articulate how that selection is made, and what criteria drive it, is demonstrating something meaningful about how they operate.
It is also worth understanding that within a serious testing practice, a tester works on one engagement at a time. This is not universally the case in the market. A tester who is moving between multiple active engagements is not giving yours the sustained attention that good manual testing requires. It is a reasonable question to ask, and how a provider answers it tells you something about how they think about quality.
Source Code Access
If a provider is proposing to test your web application without access to source code, it is worth asking why. White-box or grey-box testing, where testers have access to the codebase, is considerably more effective than black-box testing for most engagement types. It allows testers to identify entire classes of vulnerability across the codebase rather than discovering individual instances through surface-level probing, and it allows them to provide more targeted and reproducible proof of concept details. It also makes better use of your budget: more findings, more accurately characterised, in the same time window.
Black-box testing has its place, but defaulting to it without a clear rationale usually indicates either a preference for speed or budget over depth, or a tester who is relying primarily on automated tooling.
Granting a provider access to your source code is one of the more significant decisions in the engagement. For SaaS businesses, whose codebase is their core asset and most valuable IP, sharing it with a third party carries real responsibility on both sides; a quality provider understands this without being told. They will raise an NDA at the appropriate point in the conversation, and how they respond when this comes up early is itself a signal: willingness to engage, discuss, and sign is what you should expect. Reluctance or pushback is worth noting.
The same applies to how source code is actually accessed during testing. A good provider will be able to describe their approach without hesitation, whether that is isolated environments, controlled access, or another arrangement suited to your situation. The specifics matter less than whether they have a considered process at all. If the question catches them off guard, that is an answer in itself.
Scoping and Environment
The scoping conversation that happens before testing begins is itself a useful signal. A provider who asks detailed questions about your testing environment (whether it closely mirrors production, what the key differences are, and how to handle those differences) is one who understands why this matters. Findings from a test conducted against a substantially different environment can include both false positives and false negatives, which undermines the value of the whole engagement when what you are seeking is reliable information and assurance about the security of your targets.
The testing scope should also be broad enough to cover all key internet-facing targets, but not so broad as to include unnecessary systems or targets which may not be relevant or even testable.
Ultimately, the pre-test conversation should feel substantive, not perfunctory.
Report Quality
The pen test reporting is ultimately the deliverable you are buying. Ask to see a sample (redacted is fine) before committing. What you are looking for is evidence that the reporting was prepared by someone who understands both the technical detail and the business context for the engagement. Every finding should include a clear severity rating using an accepted industry system (such as DREAD or CVSS), a clear description of the vulnerability and how it was found, a proof of concept that allows your team to reproduce and verify the finding, and remediation recommendations that your team can actually act upon. If the sample reporting reads like automated scanner output dressed up with headings, that is an important signal.
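As a rough illustration of the elements a finding should carry, here is a minimal Python sketch. The Finding class, its field names, and the missing_elements helper are hypothetical and are not drawn from any provider's report format or from a published schema. The severity bands in cvss_severity follow the CVSS v3.x qualitative rating scale. Treat it as a checklist in code form that you could adapt when reviewing a sample report, not as a definitive implementation.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One report finding. Field names are illustrative, not a standard schema."""
    title: str
    cvss_score: float      # CVSS v3.x base score, 0.0 to 10.0
    description: str       # what the vulnerability is and how it was found
    proof_of_concept: str  # steps your team can follow to reproduce it
    remediation: str       # a recommendation your team can act upon


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def missing_elements(finding: Finding) -> list[str]:
    """Return the names of the text fields that are empty in this finding."""
    text_fields = ("title", "description", "proof_of_concept", "remediation")
    return [name for name in text_fields if not getattr(finding, name).strip()]


# Hypothetical example: a finding with no proof of concept supplied.
sample = Finding(
    title="SQL injection in login form",
    cvss_score=9.8,
    description="User input is concatenated into a SQL query.",
    proof_of_concept="",
    remediation="Use parameterised queries.",
)
print(cvss_severity(sample.cvss_score))  # prints "Critical"
print(missing_elements(sample))          # prints "['proof_of_concept']"
```

A check like this only confirms that each element is present. Whether the description is accurate and the proof of concept actually reproduces is still something your team has to judge by reading the report.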
What Happens After the Report
A provider who hands over the report and considers the engagement closed is not offering you a complete service or best value. Remediation support, re-testing of fixed issues, and assurance documents that you can share safely with non-technical stakeholders (governance boards, auditors, and customers or prospects) are all part of what a mature engagement looks like. The assurance reporting in particular, confirming what was tested, when, and by whom, without disclosing sensitive finding details, is what moves you from “trust us” to “here’s the evidence”. If this does not appear in the scope of what a provider offers, it is worth asking about that directly.
For a more detailed walkthrough of the full testing process and what each stage should look like, our article on what actually happens during a web application pen test covers it in depth.
The New Zealand Context
The New Zealand market is small, and the pen testing provider landscape reflects that. Fewer providers means less competitive pressure on quality, fewer points of comparison for buyers, and less institutional knowledge about what good looks like. This is not a criticism of the market; it is just the reality of operating in a smaller ecosystem.
One consequence is that internationally recognised certifications and standards matter more in this context, not less. In a larger market, reputation and referrals do a lot of the qualification work. In New Zealand, where the network is thinner and the pool of providers smaller, the certification question becomes a more important piece of due diligence. Ask whether testers hold relevant certifications (OSWE, CREST, and similar credentials are worth asking about specifically), and whether the testing methodology is aligned to a standard that is internationally recognised and auditable. These are not bureaucratic checkboxes; they are among the few objective anchors available to a buyer who cannot directly evaluate the work itself.
Also don’t be too swayed by the size and market share of a provider, or the fact that they have multiple offices both in New Zealand and overseas. The quality signals described above apply regardless of the size of a company or where they are based.
What Confidence Actually Looks Like
The outcome you are aiming for is not a report you can file. It is a position you can defend.
If your organisation commissions a penetration test and the result is a report that sits in a shared drive, you have expended time and money without building anything durable. However, if the result provides a clear picture of your actual security posture, evidence of the testing methodology and coverage, a remediation trail your team has worked through, and a document you can hand to an auditor or a prospective customer with confidence, that is a different outcome entirely.
The buyers who get the most from penetration testing are the ones who treat it as a programme rather than a point-in-time event. That means choosing a provider who is thinking about your ongoing security posture, not just this engagement, and who is willing to help you build the kind of continuous assurance that makes each round of testing build on the last.
The right question to take into a provider conversation is not “how much does a pen test cost?” but “what will I be able to demonstrate at the end of this, and to whom?” A provider who can answer that question clearly is one worth talking to further.
We wrote this guide because we think it reflects how a good engagement should work, and because we’re comfortable being held to it. If you’d like to put the framework above to use, we’re a reasonable place to start.
