What an agency had to prove to carry the Prevouched mark.
The full, published specification. Five pillars, weighted criteria, evidence requirements, tier floors, and the conditions under which the mark is suspended or revoked. Written so a buyer can inspect a specific team, and so an applicant knows exactly what will be scored.
You landed here from a badge or the directory.
This page tells you exactly what an agency had to prove to carry a Prevouched mark, and the events that will strip that mark. The five pillars each cover a way offshore engagements typically fail. The tier floors below show the minimum score required for Verified, Backed, and Managed.
Read this before you apply.
Every criterion lists what we look at, the evidence you will need to produce, the bar you must clear, and the patterns that count against you. If most items describe how your team already works, the application will be quick. If not, the rubric itself is the gap analysis.
Shorthand used throughout: L0–L4 is the five-level scoring scale. Composite is the weighted total across pillars, scored 0–100. Tier floor is the minimum a pillar or composite must reach for a given tier.
Terms used on this page.
Written for readers who have never worked with a vetting scorecard before.
- L0: Disqualifying evidence on a single criterion.
- L1: Present but not sufficient. Must be fixed before the mark can issue.
- L2 (Meets bar): The published minimum. Evidence is present and a reviewer can verify it.
- L3: Clearly above the bar on substance, consistency, and how recent the evidence is.
- L4: Best-in-class against the reference examples the reviewer panel scores from.
- Bar: The minimum evidence a criterion must show to score L2 (Meets bar).
- Composite: The weighted total score across all five pillars, on a 0–100 scale.
- Tier floor: The minimum scores required to issue a given tier.
- Critical failure: A single L0 score anywhere. Blocks the mark at every tier.
- Pillar weight: The share a pillar contributes to the composite. All five weights sum to 100%.
- Criterion weight: The share a criterion contributes to its pillar score. Criterion weights inside one pillar sum to 100%.
- Calibration set: Reference examples reviewers score against so scores mean the same thing across teams.
- Re-vetting: A fresh review of the pillars affected by a change or complaint.
- Re-attestation: The annual confirmation that the evidence on file is still current.
- Suspended: The mark is not currently valid. The verification page shows Suspended until re-vetting concludes.
- Lapsed: The re-attestation window closed without a submission. The directory listing is demoted.
- Revoked: The mark has been withdrawn. The agency may re-apply after a 90-day cooling-off period.
Five levels. One descriptor each.
Every criterion in every pillar is graded on the same five-level scale. Reviewers anchor each score to the descriptor and a reference example from the calibration set.
Critical-failure rule · Any criterion graded L0 blocks issuance at every tier.
Pillar scores roll up. Tiers gate the mark.
Each pillar score is the weighted average of its criteria, normalized to 0–100. The composite is the weighted sum across pillars. Tier floors are published; the underlying calibration keys stay internal.
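The pillar roll-up can be sketched in Python. This is an illustration only, not the panel's scoring code: the linear mapping from levels to points (L0 = 0 through L4 = 100) and the criterion weights in the example are assumptions, because the calibration keys are internal and criterion weights are not listed on this page.

```python
# Assumption: levels map linearly onto 0-100 (L0 -> 0, L4 -> 100).
# The real calibration keys are internal, so this mapping is illustrative only.
LEVEL_POINTS = {0: 0.0, 1: 25.0, 2: 50.0, 3: 75.0, 4: 100.0}


def pillar_score(criteria):
    """Weighted average of criterion levels, normalized to 0-100.

    `criteria` is a list of (level, weight) pairs. The weights inside
    one pillar must sum to 1.0 (100%).
    """
    total_weight = sum(weight for _, weight in criteria)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("criterion weights must sum to 100%")
    return sum(LEVEL_POINTS[level] * weight for level, weight in criteria)


# Hypothetical pillar: five criteria with made-up weights.
example = [(3, 0.30), (3, 0.25), (2, 0.20), (4, 0.15), (2, 0.10)]
print(pillar_score(example))
```

Under this assumed mapping, a pillar graded L2 on every criterion scores 50, so clearing a 60-point floor would take at least some criteria above the bar.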
Verified. For the buyer: the team met every pillar's minimum bar under reviewer inspection. Evidence is on file. The engagement runs on the agency's own delivery process.
Floor: all pillars ≥ 60. No criterion below L2.

Backed. For the buyer: a named US-based Prevouched liaison joins your weekly call, reads the same updates you do, and is the escalation contact when something is off.
Floor: all pillars ≥ 70. Past-work and References pillars ≥ 80.

Managed. For the buyer: Prevouched is inside the contract on the specific engagement, with defined obligations. Reserved for opt-in projects that pass legal review.
Floor: all pillars ≥ 80. Security pillar ≥ 85. Issuance is gated on legal review.
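A minimal sketch of the floor check, assuming the floors are applied exactly as published. The pillar keys, the function name, and the example scores are hypothetical. The sketch covers pillar floors, the rule that L0 and L1 grades block issuance, and the legal-review gate; it does not model any composite threshold, since none is published in the floors above.

```python
PILLARS = ["technical", "past_work", "references", "communications", "security"]

# Published floors. "all" applies to every pillar; named keys raise it for one pillar.
TIER_FLOORS = {
    "Verified": {"all": 60},
    "Backed": {"all": 70, "past_work": 80, "references": 80},
    "Managed": {"all": 80, "security": 85},
}


def meets_floors(tier, scores, lowest_level, legal_review_passed=False):
    """Return True if the pillar scores clear the published floors for `tier`.

    `lowest_level` is the lowest level (0-4) graded on any criterion.
    """
    if lowest_level < 2:  # L0 is a critical failure; L1 must be fixed first
        return False
    floors = TIER_FLOORS[tier]
    for pillar in PILLARS:
        required = max(floors["all"], floors.get(pillar, 0))
        if scores[pillar] < required:
            return False
    if tier == "Managed" and not legal_review_passed:
        return False
    return True


# Hypothetical scores, not taken from any applicant on this page.
example_scores = {
    "technical": 84,
    "past_work": 82,
    "references": 86,
    "communications": 76,
    "security": 72,
}
for tier in TIER_FLOORS:
    print(tier, meets_floors(tier, example_scores, lowest_level=2, legal_review_passed=True))
```

With these made-up scores the team clears the Verified and Backed floors but not Managed, because Communications and Security fall short of 80 and 85.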
A mark that cannot be taken away isn't trust.
The rubric is point-in-time. The status is live. These are the events that move an agency from Approved to Suspended, Lapsed, or Revoked, and the action Prevouched takes when each fires.

Trigger: A client complaint corroborated by evidence: a missed SLA, undisclosed subcontracting, or material misrepresentation.
Action: Suspension within 5 business days, pending re-vetting on the affected pillars.

Trigger: A confirmed data incident, IP dispute, or material breach of the engagement contract.
Action: Immediate suspension. Revocation on confirmation. Directory listing removed.

Trigger: Annual re-attestation not completed within the published window.
Action: Status changes to Lapsed on the verification page. Directory listing demoted.

Trigger: Routine re-vetting returns a composite below the tier floor.
Action: Tier reduced or mark revoked. The agency may re-apply after a 90-day cooling-off period.
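The two time limits above can be turned into dates with a short sketch. It assumes that business days are Monday to Friday with no holiday calendar, and that the 90-day cooling-off period counts calendar days; neither detail is stated in the rubric, and the function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Step forward `days` weekdays from `start`, skipping Saturday and Sunday."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def suspension_deadline(corroborated_on):
    """Latest date for suspension after a complaint is corroborated."""
    return add_business_days(corroborated_on, 5)


def earliest_reapply(revoked_on):
    """First date an agency may re-apply after revocation (assumed calendar days)."""
    return revoked_on + timedelta(days=90)


print(suspension_deadline(date(2024, 3, 6)))  # a Wednesday
print(earliest_reapply(date(2024, 1, 1)))
```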
How two real-looking applicants come out of the rubric.
Pillar scores below are illustrative, composed with the actual weights and tier floors so the arithmetic is reproducible. Both applicants clear the L0 critical-failure rule.
Applicant A. Even profile, strongest on references.
| Pillar | Score | Weight | Contrib. |
|---|---|---|---|
| 01 · Technical screen | 78 | 25% | 19.5 |
| 02 · Past-work review | 82 | 20% | 16.4 |
| 03 · Reference checks | 88 | 20% | 17.6 |
| 04 · Communications | 74 | 20% | 14.8 |
| 05 · Security & process | 70 | 15% | 10.5 |
| Composite | 78.8 | | |
| Tier | Verified | | |
Applicant B. High technical score, security/comms gap.
| Pillar | Score | Weight | Contrib. |
|---|---|---|---|
| 01 · Technical screen | 91 | 25% | 22.8 |
| 02 · Past-work review | 84 | 20% | 16.8 |
| 03 · Reference checks | 80 | 20% | 16.0 |
| 04 · Communications | 62 | 20% | 12.4 |
| 05 · Security & process | 55 | 15% | 8.3 |
| Composite | 76.2 | | |
| Tier | Verified | | |
Applicant A reaches a composite of 78.8 and earns Verified. Applicant B reaches a composite of 76.2, above the Backed composite threshold, but its Communications and Security pillar scores sit beneath the Backed and Managed floors, so the panel issues at Verified with remediation on pillars 04 and 05 before a tier upgrade is considered.
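The composite arithmetic in both tables can be reproduced with a few lines of Python. The scores and weights come from the tables above; the dictionary keys and function name are hypothetical.

```python
# Pillar weights from the tables, in percent.
WEIGHTS = {
    "technical": 25,
    "past_work": 20,
    "references": 20,
    "communications": 20,
    "security": 15,
}


def composite(scores):
    """Weighted sum of pillar scores on a 0-100 scale."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("pillar weights must sum to 100%")
    return sum(scores[pillar] * WEIGHTS[pillar] for pillar in WEIGHTS) / 100


applicant_a = {"technical": 78, "past_work": 82, "references": 88,
               "communications": 74, "security": 70}
applicant_b = {"technical": 91, "past_work": 84, "references": 80,
               "communications": 62, "security": 55}

print(f"A: {composite(applicant_a):.1f}")  # A: 78.8
print(f"B: {composite(applicant_b):.1f}")  # B: 76.2
```

The Contrib. column is rounded to one decimal for display. For Applicant B the unrounded contributions are 22.75 and 8.25, so adding the rounded column gives 76.3 while the composite computed from unrounded values is 76.2.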
Technical screen
We score engineering judgment, not tool familiarity. Two senior engineers read the team's real production code and compare it against a shared set of reference examples the panel has scored before.
What we look at: Readability, naming, module boundaries, idiomatic use of language and platform.
Bar: Consistently readable by a new reviewer. Modules have obvious seams. No load-bearing cleverness.
Evidence:
- Two production repositories (read-only access)
- One self-selected exemplar with reviewer commentary

Counts against:
- The same scaffold copy-pasted across services instead of shared
- Untyped inputs and outputs in a typed language
- Style and conventions change within a single repository

What we look at: Choice of patterns relative to the problem; explicit handling of failure modes; cost-of-change discipline.
Bar: Decisions are defensible and reversible. Trade-offs are named, not avoided.
Evidence:
- Architecture write-up for one shipped system
- One decision the team would now make differently

Counts against:
- Split into many services with no written reason for the split
- No documented approach to retries, duplicate requests, or partial failure
- Abstractions built before a second use case exists

What we look at: What is tested, how, and at what cost. Test selection, flakiness controls, CI signal quality.
Bar: Critical paths have tests that fail loudly when broken. CI signal is trusted, not muted.
Evidence:
- Coverage of one critical path with rationale
- CI configuration and last 30 days of run signal

Counts against:
- Test suites made up mostly of snapshot tests
- Failing tests routinely dismissed by re-running the pipeline instead of investigating
- No integration tests covering paid or revenue-critical features

What we look at: Review depth, turnaround, and the substance of the discussion in pull requests.
Bar: Reviews engage with substance. Disagreement is documented. Authors are not their own approvers.
Evidence:
- Sample of 10 recent PRs across the team
- Stated review SLA and adherence

Counts against:
- More than half of merged pull requests approved with only "looks good" and no substantive comment
- No record of a reviewer blocking or requesting changes in the last quarter
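
The "more than half" pattern on review comments can be estimated with a rough sketch like the one below. The phrase list and the exact-match rule are assumptions made for illustration; reviewers read the discussion itself, and this heuristic would only flag candidates for a closer look.

```python
import re

# Assumption: comment texts treated as non-substantive approvals.
RUBBER_STAMP = {"", "lgtm", "looks good", "looks good to me", "ship it"}


def is_rubber_stamp(review_comments):
    """True if every review comment on the PR is boilerplate approval text."""
    normalized = [
        re.sub(r"[^a-z ]", "", comment.lower()).strip()
        for comment in review_comments
    ]
    # A PR approved with no comments at all also counts (all() of an empty list is True).
    return all(text in RUBBER_STAMP for text in normalized)


def rubber_stamp_share(prs):
    """Share of PRs in the sample whose reviews carry no substantive comment."""
    if not prs:
        raise ValueError("sample is empty")
    flagged = sum(1 for comments in prs if is_rubber_stamp(comments))
    return flagged / len(prs)


# Hypothetical sample of 10 merged PRs, each a list of review comments.
sample = [
    ["LGTM"],
    ["Looks good!"],
    [],
    ["lgtm", "ship it"],
    ["Looks good to me."],
    ["LGTM :+1:"],
    ["The retry loop never backs off; please add jitter."],
    ["Looks good, but rename `tmp` before merging."],
    ["Why is this migration not reversible?"],
    ["Requesting changes: no test covers the refund path."],
]
share = rubber_stamp_share(sample)
print(share, share > 0.5)
```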

What we look at: Borderline cases only. A 60-minute pairing on a small, real problem; we watch reasoning, not typing speed.
Bar: Engineer states assumptions, validates them, and produces a working sketch with named trade-offs.
Evidence:
- Reviewer transcript and rubric notes

Counts against:
- Engineer will not explain their thinking out loud
- Cannot adapt when the reviewer introduces an intentional ambiguity

Past-work review
Screenshots and mockups are not evidence. We grade shipped work you can open, outcomes the team can prove they caused, and how long their clients stay.
What we look at: Real, accessible work. Live URLs, repos, or recorded walkthroughs. Not mockups.
Bar: Two reviewers can independently verify the team's contribution.
Evidence:
- At least three accessible deliverables from the last 24 months

Counts against:
- An NDA cited as the reason no work at all can be shown or discussed, even under a reviewer NDA
- Only mockup PDFs, no shippable artifact

What we look at: What changed because the team was involved. Numeric where possible, narrative where not.
Bar: Outcomes are named, sourced, and tied to the team's actual work.
Evidence:
- Before/after metrics or client-attested narrative
- Identification of the team's specific scope

Counts against:
- Outcomes claimed by the team that happened before their engagement started
- Percentage improvements quoted with no baseline number

What we look at: How long clients stay and why. Renewals and scope expansions count more than headline logos.
Bar: Median engagement longer than 9 months; renewals on at least two of the last five engagements.
Evidence:
- Engagement timeline for the last five clients
- Renewal and expansion record

Counts against:
- Repeated short pilots (around six weeks) that rarely convert into longer work
- Client departures cluster around one specific team lead
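
The retention bar (median engagement above 9 months, at least two renewals among the last five) is simple enough to check mechanically. A minimal sketch, with hypothetical engagement data and a hypothetical function name:

```python
from statistics import median


def meets_retention_bar(engagements):
    """Check the retention bar on the last five engagements.

    `engagements` is a list of (duration_months, renewed) pairs.
    """
    durations = [months for months, _ in engagements]
    renewals = sum(1 for _, renewed in engagements if renewed)
    return median(durations) > 9 and renewals >= 2


# Hypothetical agency with a mix of long and short engagements.
steady = [(14, True), (6, False), (11, True), (3, False), (10, False)]
# Hypothetical agency running mostly six-week pilots.
pilots = [(1.5, False), (1.5, False), (2, True), (1.5, False), (12, True)]

print(meets_retention_bar(steady), meets_retention_bar(pilots))
```

The second agency has two renewals but still misses the bar, because the median engagement is only six weeks.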

What we look at: Evidence that the team can say no, re-scope, and protect the engagement from drift.
Bar: Team can cite at least one engagement where they pushed back and the relationship survived.
Evidence:
- One example of a refused or re-scoped request and the reasoning

Counts against:
- Every engagement grew in scope without ever re-negotiating price
- No record of a difficult scope conversation with any client

What we look at: Code, docs, and credentials transfer cleanly at engagement end.
Bar: A non-author can stand the system up from the handoff alone.
Evidence:
- Sample handoff package or runbook from a prior engagement

Counts against:
- Credentials or code withheld at engagement end to pressure renewal
- No documented checklist for ending an engagement cleanly

Reference checks
We speak with prior clients on the record using the same set of questions for every agency, and we keep the notes on file. References an agency cannot or will not provide are themselves a signal.
What we look at: Number, recency, and seniority of references actually reached.
Bar: Three reachable references within the last 24 months; two completed structured calls.
Evidence:
- Three references contacted; at least two with decision-maker authority

Counts against:
- Only one reference is actually reachable
- All references come from a single industry while the agency claims to serve many

What we look at: Did the team ship what was promised, in roughly the time and cost they said?
Bar: Majority of references confirm delivery on substance, with reasonable variance on time and cost.
Evidence:
- Reference rating on delivery vs. scope, with examples

Counts against:
- Multiple references describe budget or timeline overruns that the team did not raise until after the fact

What we look at: How the team behaved when something went wrong. Escalation speed, transparency, recovery.
Bar: References can name at least one incident handled with proactive disclosure.
Evidence:
- Reference recounting of one incident or near-miss

Counts against:
- References cannot recall a single difficult moment on the engagement
- Long periods of silence followed by unexpected invoices

What we look at: Two questions asked verbatim. Hesitations are scored.
Bar: Unprompted yes from at least two of three references.
Evidence:
- Direct quotes; pauses and hedges noted

Counts against:
- An initial yes that becomes a no once the reference is asked a follow-up question

What we look at: Whether the references offered are willing to discuss weaknesses, not just strengths.
Bar: Each reference identifies a real area to improve.
Evidence:
- At least one substantive weakness named per call

Counts against:
- Every reference is uniformly positive, suggesting they were coached or hand-picked

Communications assessment
Most offshore engagements break down on written communication before they break down on code. We test writing, calls, scope conversations, and how the team raises bad news.
What we look at: A 500-word async update on a synthetic engagement scenario.
Bar: Lede first, decisions named, asks unambiguous, no buried risk.
Evidence:
- Submitted sample graded against the calibrated reference set

Counts against:
- Status updates that hide the actual blocker several paragraphs in
- Long unstructured paragraphs with no headings, bullets, or clear asks

What we look at: Functional English on the call, on the page, and under disagreement.
Bar: Comprehension and expression are not the bottleneck of the conversation.
Evidence:
- Live call segment; reviewer scoring

Counts against:
- Rehearsed answers that fall apart when the reviewer asks a follow-up question

What we look at: Stated SLA for async response and adherence over a one-week observation window.
Bar: First response within the stated SLA on all three test messages; substantive response within one business day.
Evidence:
- Logged turnaround on three test messages across the window

Counts against:
- First reply is an emoji acknowledgement with no substantive follow-up

What we look at: Can the team scope realistically and push back on a request that doesn't fit?
Bar: Team narrows scope, names trade-offs, proposes a smaller first delivery.
Evidence:
- Synthetic intake scenario; reviewer transcript

Counts against:
- Team agrees to anything the buyer proposes
- Team quotes a price without asking a single clarifying question

What we look at: How the team raises a slip, an overrun, or a quality issue.
Bar: Issue stated plainly, with impact, options, and a recommended path.
Evidence:
- Roleplay segment; reviewer notes

Counts against:
- Bad news softened with so many qualifiers the actual issue is unclear
- News only surfaced after the client asks directly

Security and process baseline
This is not a formal security audit or a SOC 2 certification. We confirm the team handles client data, access, and contracts at a level a serious buyer can defend to their own security team.
What we look at: Where client data lives, who can read it, how long it is retained, how it is deleted.
Bar: Written policy exists, is followed in practice, and survives reviewer questioning.
Evidence:
- Data-handling policy; sample DPA

Counts against:
- Production data copied onto personal laptops
- No process for deleting client data when the engagement ends

What we look at: SSO, MFA, least-privilege, joiner and leaver discipline.
Bar: MFA enforced; leavers off all systems within one business day.
Evidence:
- Identity-provider configuration screenshot; leaver checklist

Counts against:
- Shared administrator accounts used by multiple people
- Former employees still have access to client repositories

What we look at: Source control, branch protection, deploy controls, change management.
Bar: Mainline protected; deploys are reviewed; rollback is rehearsed.
Evidence:
- Repository protection rules; deploy pipeline overview

Counts against:
- Engineers push directly to the main branch with no review
- Manual edits to production systems with no audit trail

What we look at: A documented process, a recent test of it, and at least one post-mortem on file.
Bar: Process exists, has been used, and led to a documented change.
Evidence:
- IR runbook; one redacted post-mortem

Counts against:
- Team reports no incidents have ever occurred, which is not credible at their scale

What we look at: MSA quality, IP assignment, subcontractor disclosure, insurance.
Bar: IP cleanly assigns to the client. Subcontractors are disclosed. Insurance is current.
Evidence:
- Sample MSA; certificate of insurance; subcontractor policy

Counts against:
- Ambiguity about who owns the delivered code
- Subcontractors used but not disclosed to the client
- Professional liability insurance has lapsed

Reviewers train on the same reference set.
Quarterly calibration sessions re-anchor scoring against shared exemplars at L1, L2, L3, and L4 in every pillar. Drift between reviewers is measured and tracked.
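The rubric does not say how drift is measured. One plausible sketch, offered purely as an assumption, is the mean absolute gap between a reviewer's levels and the reference levels on the shared exemplars. The function name and the sample scores are hypothetical.

```python
def reviewer_drift(reference_levels, reviewer_levels):
    """Mean absolute gap, in levels, between a reviewer and the reference scores.

    Both arguments list the level (0-4) given to the same exemplars, in the same order.
    """
    if not reference_levels or len(reference_levels) != len(reviewer_levels):
        raise ValueError("need two non-empty lists of equal length")
    gaps = [abs(ref - got) for ref, got in zip(reference_levels, reviewer_levels)]
    return sum(gaps) / len(gaps)


# Hypothetical session: four exemplars anchored at L1, L2, L3, and L4.
reference = [1, 2, 3, 4]
reviewer = [1, 2, 4, 4]  # scored the L3 exemplar one level high
print(reviewer_drift(reference, reviewer))  # 0.25
```

Tracking this number per reviewer across quarterly sessions would show whether scoring is converging on the shared exemplars or moving away from them.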
One appeal per cycle, heard by a second panel.
An agency may contest a scoring decision once per vetting cycle. The re-review is conducted by reviewers who did not sit on the original panel, and their score stands.
Rubric versions are dated and preserved.
Each agency record states the rubric version it was issued against. A new rubric does not retroactively re-grade prior cohorts; re-grading happens at re-attestation.