Answer-first summary

Call center quality assurance evaluates customer interactions against a defined standard to improve service quality, compliance, and satisfaction. About 95 percent of call centers run QA, but most score only about four calls per agent per month (SQM Group), and 81 percent of those traditional scores do not correlate with CSAT (SQM Group). The fix is a short, outcome-weighted QA scorecard (resolution, compliance, empathy, efficiency, customer effort), calibrated against real CSAT, with AI scoring 100 percent of calls instead of a sampled few. The sections below show how to build it.

Most call center quality assurance programs have a quiet problem: they barely look at the work, and the part they do look at does not predict whether customers are actually happy. Here is the math that should bother every operations leader. According to CX research firm SQM Group, the long-standing industry standard is to randomly score about four calls per agent, per month. If that agent handles a thousand calls in a month, the QA team is forming its entire opinion of that agent from less than half of one percent of the work. Worse, SQM's research found that 81 percent of traditional QA scores did not correlate with the customer's own satisfaction rating. You are measuring four calls out of a thousand, and the score you produce does not track the one outcome that pays the bills. Roughly 95 percent of call centers run a QA program, and most of them run this same broken version of it. Let us fix the scorecard.

What call center quality assurance actually is

Call center quality assurance (QA) is the practice of evaluating customer interactions against a defined standard, then using what you find to improve service quality, compliance, and customer satisfaction. In a healthy program, QA is not a compliance tax or a way to catch agents doing something wrong. It is a feedback engine: it tells you what is working on your calls, what is breaking, and where coaching will move the needle. This is also why the practice is often called call center quality monitoring or, more broadly, customer service quality assurance, since the same discipline now spans voice, chat, and email.

The reason most QA programs fail to deliver that feedback is rarely effort. It is three structural flaws that compound.

Why traditional QA fails

Three problems sit underneath nearly every underperforming quality program. They are structural, not personal, which is why working harder inside the old model does not fix them.

1. The sample is too small to be true

Four calls a month cannot represent a thousand. Random sampling at that rate misses your edge cases, your hardest calls, and your best ones. You end up coaching agents on a handful of interactions that may not be representative of anything. Agents know this, which is why only about 17 percent of them believe QA scoring actually improves customer satisfaction (SQM Group). When the people being measured do not trust the measurement, the program is dead on arrival.
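To make the sampling math concrete, here is a minimal Python sketch. The four-calls-in-a-thousand figures come from the text above; the 2 percent issue rate is a hypothetical number chosen only for illustration, and the function names are ours.

```python
def coverage(sampled: int, handled: int) -> float:
    """Share of an agent's calls that QA actually reviews."""
    return sampled / handled


def miss_probability(issue_rate: float, sampled: int) -> float:
    """Chance that a random sample contains zero calls with a given issue.

    Treats each sampled call as an independent draw, which is a close
    approximation when the sample is tiny relative to total volume.
    """
    return (1.0 - issue_rate) ** sampled


# Four scored calls out of a thousand handled (figures from the article).
print(f"Coverage: {coverage(4, 1000):.1%}")

# Hypothetical: an issue that shows up on 2 percent of an agent's calls.
print(f"Chance the sample misses it entirely: {miss_probability(0.02, 4):.1%}")
```

Under that assumed 2 percent rate, a four-call sample misses the issue entirely more than nine times out of ten, which is the practical meaning of "too small to be true."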

2. The scorecard measures the wrong things

Most QA forms are internal checklists: did the agent use the greeting, did they say the closing script, did they tag the ticket correctly. Those are easy to score and easy to defend. They are also weakly connected to whether the customer left satisfied. That disconnect is exactly why 81 percent of traditional QA scores fail to correlate with CSAT. If your scorecard rewards process compliance and the customer rewards resolution and empathy, you are optimizing for the wrong target.

3. The feedback lags

A monthly QA review tells an agent about a call they handled three weeks ago. By then the habit has repeated a hundred times. Coaching only changes behavior when it is close to the behavior. Quarterly and even monthly cycles are too slow to compound.

How to build a QA scorecard that works

A QA scorecard is the rubric you score every interaction against. A good one is short, weighted toward outcomes, and tied to what customers actually rate. Here is a structure that holds up. The percentages are weights, except compliance, which is a hard pass or fail.

Resolution & accuracy · 40%

Did the customer's actual problem get solved, correctly, on this contact? This is the heaviest weight because it is the strongest driver of satisfaction. First contact resolution belongs here.

Compliance · Pass or fail

Required disclosures, identity verification, regulated scripting, and data handling. These are non-negotiables. A miss is a fail on the whole call, not a deduction, because in regulated lanes a single compliance miss can cost far more than a low CSAT.

Communication & empathy · 25%

Tone, clarity, active listening, and whether the agent matched the customer's emotional state. This is where most checklists are thinnest and where customers form their judgment.

Process & efficiency · 20%

Did the agent follow the workflow, use the tools, and keep handle time reasonable without rushing the customer? Efficiency matters, but it is weighted below resolution on purpose. A fast call that does not solve the problem is not a good call.

Customer effort · 15%

How hard did the customer have to work? Repeated transfers, repeated explanations, and dead ends all raise effort and lower loyalty.

Two rules make this scorecard actually predictive. First, calibrate it against your post-call survey: if your QA scores and your CSAT scores keep disagreeing, your weights are wrong, not your customers. Second, keep the form short. A scorecard with forty line items gets filled out inconsistently and never gets coached on.
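The scorecard above can be sketched in a few lines of Python. The weights and the compliance gate come from the structure described here; the category keys, the 0.0 to 1.0 rating scale, and the choice to report a compliance failure as a score of 0 are our assumptions for illustration, not a prescribed implementation.

```python
from dataclasses import dataclass

# Weights from the scorecard above. Compliance is a gate, not a weight.
WEIGHTS = {
    "resolution_accuracy": 0.40,
    "communication_empathy": 0.25,
    "process_efficiency": 0.20,
    "customer_effort": 0.15,
}


@dataclass
class CallEvaluation:
    compliance_passed: bool
    ratings: dict[str, float]  # assumed scale: 0.0 (worst) to 1.0 (best)


def score_call(evaluation: CallEvaluation) -> float:
    """Return a 0-100 score; a compliance miss fails the whole call."""
    if not evaluation.compliance_passed:
        # Assumption: a failed call is reported as 0 rather than a deduction.
        return 0.0
    missing = WEIGHTS.keys() - evaluation.ratings.keys()
    if missing:
        raise ValueError(f"Missing ratings for: {sorted(missing)}")
    total = sum(WEIGHTS[name] * evaluation.ratings[name] for name in WEIGHTS)
    return round(100 * total, 1)


example = CallEvaluation(
    compliance_passed=True,
    ratings={
        "resolution_accuracy": 1.0,
        "communication_empathy": 0.8,
        "process_efficiency": 0.5,
        "customer_effort": 0.6,
    },
)
print(score_call(example))
```

Keeping compliance outside the weighted sum is the point of the design: a strong resolution score cannot average away a missed disclosure.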

The fix the industry has been waiting for: score every call with AI

The sampling problem was never a philosophy problem. It was an economics problem. Listening to and scoring calls by hand is slow and expensive, so teams sampled four a month because that is what a human QA analyst could realistically get through.

That constraint is gone. AI-assisted QA can transcribe and score every call against your rubric in minutes, which changes the entire premise. Instead of judging an agent on four random calls, you judge each agent, and the program as a whole, on the full call volume. Outliers surface instead of hiding. Coaching points are real patterns, not one-off samples. And the feedback can reach the agent the same week rather than the next month. For the deeper technical view of how AI scoring, sentiment, and compliance detection actually work, see our breakdown of AI QA in the call center.

This is the model we run at Call Force Global. Every call our nearshore agents handle gets 100 percent AI-assisted quality assurance against the client's rubric, not a sampled few percent. The human stays in the loop on coaching and judgment, but the coverage is the whole call volume. For a buyer, that is the difference between hoping your far-offshore partner is consistent and seeing exactly how the program performs on every interaction. We deliver that as part of dedicated nearshore customer support outsourcing at $12 to $18 per agent hour, in native English with US business hour overlap. Our agents are fronters and support staff, not licensed advisors, so QA centers on service quality, resolution, and compliance scripting rather than licensed advice.

Tie QA back to the customer, not the checklist

The last move is the most important and the most ignored: bring the customer's own voice into the QA evaluation. The reason traditional scorecards drift away from CSAT is that they never look at the post-call survey alongside the scored call. When you overlay the two, you learn which scorecard items predict a happy customer and which are just noise, and you reweight accordingly. QA stops being an internal audit and becomes a customer-outcome model.
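One way to run that overlay is to correlate each scorecard item with the survey result for the same call. The sketch below uses a plain Pearson correlation; the item names, the 1 to 5 CSAT scale, and the sample rows are made up for illustration, and a real calibration would need far more calls than this.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("Need two series of equal length, at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Convention chosen here: an item that never varies predicts nothing.
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_correlations(calls: list[dict[str, float]]) -> dict[str, float]:
    """Correlate every scorecard item with the CSAT rating of the same call."""
    csat = [call["csat"] for call in calls]
    items = [key for key in calls[0] if key != "csat"]
    return {item: pearson([call[item] for call in calls], csat) for item in items}


# Hypothetical scored calls joined with their post-call survey (CSAT 1 to 5).
calls = [
    {"resolution": 1.0, "greeting_used": 1.0, "csat": 5},
    {"resolution": 0.5, "greeting_used": 1.0, "csat": 3},
    {"resolution": 0.0, "greeting_used": 1.0, "csat": 1},
    {"resolution": 1.0, "greeting_used": 1.0, "csat": 5},
]
for item, r in item_correlations(calls).items():
    print(f"{item}: {r:.2f}")
```

Items whose correlation stays near zero across a large sample are candidates for a lower weight or removal; items that track CSAT closely earn more of the rubric.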

For context on where the bar sits: average US CSAT runs around 73 percent, a good score lands between 75 and 84 percent, and anything above 84 percent is genuinely excellent. About 80 percent of customer service organizations already use CSAT as their primary metric. If your QA scores are high while your CSAT sits at 73, your scorecard is measuring the wrong thing. Fix the rubric, not the customer. To see how CSAT sits alongside the other operating metrics, the Call Center KPI Benchmark dashboard shows AHT, FCR, ASA, CSAT, occupancy, and SLA bands by vertical.

QA reality check (SQM Group)

  • ~4 calls per agent per month is the traditional QA sampling standard, less than half of one percent of a thousand-call month.
  • 81 percent of traditional QA scores did not correlate with CSAT.
  • Only ~17 percent of agents believe QA scoring improves customer satisfaction. AI scoring on 100 percent of calls closes the sampling gap.

Source: SQM Group, Call Center Quality Monitoring Best Practices.

The takeaway

Call center quality assurance is not failing because teams do not care. It is failing because they inherited a method built around a constraint that no longer exists. Score more than four calls a month. Weight the rubric toward resolution, empathy, and customer effort instead of internal checkboxes. Calibrate against your real CSAT. And use AI to make 100 percent coverage the default rather than a luxury. Do that, and QA goes from a number nobody trusts to the most useful operating signal you have.

"The most expensive QA program is the one nobody believes. If your agents do not trust the score and it does not predict CSAT, you are paying for a number that changes nothing. Cover every call, weight toward the customer, and the score starts to earn its keep."

-- Miki Furman, Co-Founder & CTO at Call Force Global

Why Choose Call Force Global

We run dedicated nearshore agent teams from the Caribbean and Latin America with 100 percent AI-assisted QA on every call, native English, and US business hour overlap, all bundled into one transparent rate.

100% AI QA

Every call scored against your rubric, not a sampled few

Your Time Zone

Native English on US business hours

$12 to $18 / hour

All-inclusive nearshore rate, QA included

Cite This Page

Writers and researchers are welcome to cite the figures on this page with attribution. SQM Group figures should be attributed to SQM Group; the QA scorecard framework and Call Force Global facts should be attributed to Call Force Global. Use the reference below.

APA

Furman, M. (2026). Call center quality assurance is broken: How to rebuild your QA scorecard. Call Force Global. Retrieved from https://callforce.global/blog/call-center-quality-assurance/

Sources

  • SQM Group. Call Center Quality Monitoring Best Practices: ~4 calls per agent per month QA sampling standard; 81 percent of traditional QA scores did not correlate with CSAT; ~17 percent of agents believe QA scoring improves CSAT. sqmgroup.com
  • Industry surveys: ~95 percent of call centers run QA monitoring; ~80 percent of customer service organizations use CSAT as their primary metric.
  • CSAT benchmark data: average US CSAT ~73 percent; good 75 to 84 percent; excellent above 84 percent.

Frequently Asked Questions

What is call center QA?

Call center quality assurance (QA) is the practice of evaluating customer interactions against a defined standard, then using what you find to improve service quality, compliance, and customer satisfaction. In a healthy program QA is a feedback engine, not a compliance tax: it shows what is working on your calls, what is breaking, and where coaching will move the needle. About 95 percent of call centers run a QA program.

How many calls should QA score per agent?

The long-standing industry standard is to randomly score about 4 calls per agent per month, according to SQM Group. For an agent who handles a thousand calls a month, that is less than half of one percent of the work, which is too small a sample to be representative. With AI-assisted QA the practical answer is now 100 percent of calls, because transcribing and scoring every interaction against the rubric is no longer cost-prohibitive.

What is a QA scorecard?

A QA scorecard is the rubric you score every customer interaction against. A scorecard that holds up is short and weighted toward outcomes: resolution and accuracy (about 40 percent), compliance (pass or fail, not weighted), communication and empathy (about 25 percent), process and efficiency (about 20 percent), and customer effort (about 15 percent). Calibrate the weights against your post-call CSAT survey so the score predicts a happy customer rather than internal checkboxes.

Why don't QA scores match CSAT?

SQM Group research found that 81 percent of traditional QA scores did not correlate with customer satisfaction. The usual cause is that most QA forms are internal checklists (did the agent use the greeting, the closing script, the right ticket tag) that are easy to score but weakly connected to whether the customer left satisfied. When the scorecard rewards process compliance and the customer rewards resolution and empathy, the two scores drift apart. The fix is to overlay QA scores with post-call CSAT and reweight the rubric toward what actually predicts a satisfied customer.

What is a good CSAT score?

Average US CSAT runs around 73 percent, a good score lands between 75 and 84 percent, and anything above 84 percent is genuinely excellent. About 80 percent of customer service organizations already use CSAT as their primary metric. If your QA scores are high while your CSAT sits at 73 percent, the scorecard is measuring the wrong thing and the rubric, not the customer, needs to change.

Ready to get started?

Want QA on every call, not four a month?

Tell us about your operation and we will deliver a transparent, all-inclusive nearshore proposal within 24 hours, with 100 percent AI-assisted QA bundled in. No setup fees, no hidden costs.

100% AI QA · 24-hour response · All-inclusive pricing · Live in 2-3 weeks