Last month I rebuilt a client’s entire lead scoring pipeline across Salesforce and HubSpot. Instead of writing everything from scratch, I ran the same tasks through three AI coding assistants—GitHub Copilot, Cursor, and Claude Code—and tracked which produced usable code, which hallucinated API endpoints, and which actually understood CRM-specific patterns. The results weren’t what I expected.

The Test Setup

I designed seven real-world CRM development tasks that I encounter on actual client projects. These aren’t toy problems—they’re the kind of work that eats up 60-70% of a CRM developer’s week.

The seven tasks:

  1. Salesforce Apex trigger for lead deduplication
  2. HubSpot custom workflow action via serverless function
  3. Dynamics 365 plugin for cascading field updates
  4. REST API integration between Salesforce and a marketing platform
  5. Bulk data migration script with error handling
  6. Custom Lightning Web Component for opportunity scoring
  7. HubSpot CRM card (new V3 format) for displaying external data

Each assistant got the same prompt, formatted identically. I tested GitHub Copilot (GPT-4.1 model, June 2026), Cursor (with Claude 3.5 Opus as the backend), and Claude Code (Claude 4 Sonnet, CLI version 1.2).

I scored each output on four criteria: compiles/runs without errors (pass/fail), follows platform best practices (1-10), handles edge cases (1-10), and time to production-ready (minutes of manual editing needed).

Task 1: Salesforce Apex Lead Deduplication Trigger

This is bread-and-butter Salesforce work. The prompt: “Write a before-insert Apex trigger that checks for duplicate leads based on email, normalizing email case and handling null values. If a duplicate exists, add an error to the record. Include a handler class and test class with at least 90% coverage.”

GitHub Copilot’s Output

Copilot produced a trigger, handler, and test class in about 8 seconds. The structure was clean:

trigger LeadDuplicateTrigger on Lead (before insert) {
    LeadDuplicateHandler.checkDuplicates(Trigger.new);
}

The handler correctly bulkified the query by collecting emails into a Set before querying. Good. But it missed something I see junior devs miss all the time—it didn’t account for leads being inserted in the same batch that have the same email as each other. It only checked against existing records in the database.

The test class hit 87% coverage, not the 90% the prompt asked for. Salesforce itself only requires 75% to deploy, but plenty of orgs gate releases at 90% in CI, so this could still block a deployment in stricter shops.

Time to production-ready: 18 minutes.

Cursor’s Output

Cursor took a different approach. It asked a clarifying question in the chat (which I appreciated): “Should duplicates within the same batch also be flagged?” I said yes.

The resulting code handled both scenarios—duplicates against existing records AND within the same batch:

public static void checkDuplicates(List<Lead> newLeads) {
    Set<String> newEmails = new Set<String>();
    Map<String, Lead> emailToLeadMap = new Map<String, Lead>();
    
    for (Lead l : newLeads) {
        if (l.Email != null) {
            String normalizedEmail = l.Email.toLowerCase().trim();
            if (emailToLeadMap.containsKey(normalizedEmail)) {
                l.addError('Duplicate email found within this batch: ' + l.Email);
            } else {
                emailToLeadMap.put(normalizedEmail, l);
                newEmails.add(normalizedEmail);
            }
        }
    }
    
    List<Lead> existingLeads = [
        SELECT Id, Email FROM Lead 
        WHERE Email IN :newEmails
    ];
    
    Set<String> existingEmails = new Set<String>();
    for (Lead existing : existingLeads) {
        existingEmails.add(existing.Email.toLowerCase().trim());
    }
    
    for (Lead l : newLeads) {
        if (l.Email != null && existingEmails.contains(l.Email.toLowerCase().trim())) {
            l.addError('A lead with this email already exists.');
        }
    }
}

Test class hit 94% coverage on the first try. One minor issue: it didn’t use System.assert with messages, just bare assertions—a style thing, but it matters for debugging failed tests.

Time to production-ready: 7 minutes.

Claude Code’s Output

Claude Code generated the most complete solution. It included the trigger, handler, test class, AND a utility class for email normalization that handled edge cases like plus-addressing (e.g., user+tag@gmail.com) and dots in Gmail addresses. Honestly, I didn’t ask for that—but it’s the kind of thing that burns you six months later when marketing starts using email aliases.
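The utility Claude Code wrote was Apex, but the canonicalization idea translates directly. Here's my own minimal JavaScript sketch of the same rules—lowercase and trim, strip a plus-suffix, and ignore dots in the local part for Gmail-hosted domains (the helper name and exact rules are my reconstruction, not the assistant's output):

```javascript
// Hypothetical sketch of email canonicalization for dedup purposes:
// lowercase + trim, drop a "+tag" suffix, and ignore dots in the local
// part for Gmail-hosted addresses (where dots are not significant).
function canonicalizeEmail(email) {
  if (!email) return null;
  const trimmed = email.trim().toLowerCase();
  const atIndex = trimmed.lastIndexOf('@');
  if (atIndex < 1) return trimmed; // not a parseable address; leave as-is
  let local = trimmed.slice(0, atIndex);
  const domain = trimmed.slice(atIndex + 1);
  // Plus-addressing: jane+promo@example.com delivers to jane@example.com
  const plusIndex = local.indexOf('+');
  if (plusIndex !== -1) local = local.slice(0, plusIndex);
  // Gmail treats j.ane@gmail.com and jane@gmail.com as the same mailbox
  if (domain === 'gmail.com' || domain === 'googlemail.com') {
    local = local.replace(/\./g, '');
  }
  return `${local}@${domain}`;
}
```

Two addresses that canonicalize to the same string get flagged as duplicates. The trade-off is false positives for hosts where dots are significant, which is exactly why this belongs in a separate utility class you can tune per client.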

The test class hit 96% coverage and included negative tests for null emails, empty strings, and batch-duplicate scenarios.

One problem: it used @isTest(SeeAllData=true) in one test method. That’s an anti-pattern in Apex testing. Easy fix, but surprising that it made that choice.

Time to production-ready: 5 minutes.

Winner for this task: Claude Code. The extra email normalization logic showed genuine understanding of the business problem, not just the technical requirement.

Task 2: HubSpot Serverless Custom Action

The prompt: “Create a HubSpot serverless function for a custom workflow action that enriches a contact with company data from Clearbit’s API. Handle rate limiting, API failures, and log errors to a HubSpot custom property.”

The Results Were More Mixed Here

Copilot wrote a functional Node.js function but used the HubSpot V2 API endpoints. Those still work, but HubSpot’s been pushing V3 since 2024, and V2 will eventually get deprecated. The Clearbit integration was solid.

Cursor produced V3 API calls but hallucinated a Clearbit endpoint that doesn’t exist (/v3/companies/enrich—the actual path is /v2/companies/find). This is the kind of error that wastes 45 minutes of debugging because the code looks right.

Claude Code nailed the HubSpot V3 API usage and got the Clearbit endpoint correct. It also included retry logic with exponential backoff:

const enrichContact = async (contactId, email, retries = 3) => {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const clearbitResponse = await axios.get(
        `https://company.clearbit.com/v2/companies/find`,
        {
          params: { domain: email.split('@')[1] },
          headers: { Authorization: `Bearer ${process.env.CLEARBIT_API_KEY}` },
          timeout: 5000
        }
      );
      
      await hubspotClient.crm.contacts.basicApi.update(contactId, {
        properties: {
          company_enrichment_status: 'success',
          company_size: clearbitResponse.data.metrics?.employees || '',
          company_industry: clearbitResponse.data.category?.industry || '',
          enrichment_date: new Date().toISOString()
        }
      });
      
      return { success: true };
    } catch (error) {
      if (error.response?.status === 429 && attempt < retries) {
        const delay = Math.pow(2, attempt) * 1000;
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      
      await hubspotClient.crm.contacts.basicApi.update(contactId, {
        properties: {
          company_enrichment_status: `failed: ${error.message}`
        }
      });
      
      throw error;
    }
  }
};
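That retry loop generalizes nicely. Here's a reusable sketch of the same pattern (my own extraction, not any assistant's output), which keeps the backoff plumbing out of the enrichment logic:

```javascript
// Generic retry-with-exponential-backoff wrapper (illustrative sketch).
// Retries only when shouldRetry says the error is transient, doubling the
// delay each attempt: 2s, 4s, 8s with the defaults below.
async function withRetry(fn, { retries = 3, baseDelayMs = 1000, shouldRetry = () => true } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries || !shouldRetry(error)) throw error;
      const delay = Math.pow(2, attempt) * baseDelayMs;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

You'd wrap the Clearbit call as something like `withRetry(() => axios.get(url, config), { shouldRetry: e => e.response?.status === 429 })`, so only rate-limit responses trigger the backoff.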

Winner: Claude Code again, but Copilot was a close second once you swapped to V3 endpoints. Cursor’s hallucinated API path was a dealbreaker—that kind of confident incorrectness is worse than no output at all.

Task 4: REST API Integration Between Salesforce and a Marketing Platform

I’m skipping to Task 4 because it revealed the most interesting differences. The prompt: “Write a Salesforce Apex class that syncs closed-won opportunities to a marketing automation platform via REST API. Include OAuth 2.0 authentication with token refresh, batch processing for governor limits, and a schedulable class that runs every 4 hours.”

Where Copilot Excels

Copilot understood Salesforce governor limits intuitively. It grouped records into payloads of 100 per callout, keeping the number of callouts per transaction well under the 100-callout limit, and used Database.AllowsCallouts correctly. The OAuth flow was textbook—stored the refresh token in a Custom Setting, handled token expiry.

But it used synchronous callouts inside the schedulable class. That’s a violation of Salesforce’s scheduling rules—you can’t make callouts directly from Schedulable.execute(). You need to chain to a @future method or Queueable.

Where Cursor Got Creative

Cursor used a Queueable chain pattern—each batch processes 50 records, then chains to another Queueable for the next batch. This is actually the preferred pattern in 2026 because of the depth limit increase to 10 in recent Salesforce releases. The OAuth implementation used Named Credentials, which is the modern best practice:

public class OpportunitySyncQueueable implements Queueable, Database.AllowsCallouts {
    private List<Id> opportunityIds;
    private Integer batchIndex;
    private static final Integer BATCH_SIZE = 50;
    
    public OpportunitySyncQueueable(List<Id> oppIds, Integer idx) {
        this.opportunityIds = oppIds;
        this.batchIndex = idx;
    }
    
    public void execute(QueueableContext context) {
        Integer startIdx = batchIndex * BATCH_SIZE;
        Integer endIdx = Math.min(startIdx + BATCH_SIZE, opportunityIds.size());
        
        List<Id> currentBatch = new List<Id>();
        for (Integer i = startIdx; i < endIdx; i++) {
            currentBatch.add(opportunityIds[i]);
        }
        
        List<Opportunity> opps = [
            SELECT Id, Name, Amount, CloseDate, Account.Name
            FROM Opportunity
            WHERE Id IN :currentBatch
        ];
        
        HttpRequest req = new HttpRequest();
        req.setEndpoint('callout:Marketing_Platform/api/v2/deals/batch');
        req.setMethod('POST');
        req.setHeader('Content-Type', 'application/json');
        req.setBody(JSON.serialize(mapToPayload(opps)));
        
        Http http = new Http();
        HttpResponse res = http.send(req);
        
        if (res.getStatusCode() != 200) {
            logError(res, currentBatch);
        }
        
        // Chain next batch
        if (endIdx < opportunityIds.size()) {
            System.enqueueJob(
                new OpportunitySyncQueueable(opportunityIds, batchIndex + 1)
            );
        }
    }
    // mapToPayload and logError are helper methods omitted from this excerpt
}

Where Claude Code Went Deep

Claude Code produced the most production-ready solution. Beyond the basic sync, it included:

  • A SyncLog__c custom object design for tracking each sync attempt
  • Error categorization (authentication failure vs. data validation vs. timeout)
  • A circuit breaker pattern—if 5 consecutive requests fail, it stops and creates a case for the admin
  • Idempotency keys to prevent duplicate pushes on retry

The code was about 40% longer than the other two outputs. In a real project, that’s both a strength (more complete) and a risk (more surface area for bugs). The circuit breaker logic was something I’d actually use in production—I’ve seen runaway sync jobs burn through API limits dozens of times.
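Conceptually, the breaker is simple. A minimal sketch of the pattern (my own, with hypothetical names; Claude Code's version was Apex and created a Case for the admin when it tripped):

```javascript
// Minimal circuit breaker sketch: after `threshold` consecutive failures
// the breaker opens and every further call is rejected immediately,
// instead of hammering an API that is already down.
class CircuitBreaker {
  constructor(threshold = 5, onTrip = () => {}) {
    this.threshold = threshold;
    this.onTrip = onTrip; // e.g. notify an admin / create a Case
    this.consecutiveFailures = 0;
    this.open = false;
  }

  async call(fn) {
    if (this.open) throw new Error('Circuit open: sync halted');
    try {
      const result = await fn();
      this.consecutiveFailures = 0; // any success resets the counter
      return result;
    } catch (error) {
      this.consecutiveFailures++;
      if (this.consecutiveFailures >= this.threshold) {
        this.open = true;
        this.onTrip(error);
      }
      throw error;
    }
  }
}
```

The payoff is the failure ceiling: a dead endpoint costs you `threshold` requests, not your entire daily API quota.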

Winner: Cursor for architecture, Claude Code for completeness. If I had to ship one of these today, I’d take Cursor’s Queueable chain and add Claude Code’s error handling and circuit breaker.

Task 6: Lightning Web Component for Opportunity Scoring

This task exposed the biggest weakness across all three tools. The prompt: “Build a Lightning Web Component that displays an AI-generated opportunity score with a visual gauge, pulls scoring data from a custom Apex controller, and updates in real-time when opportunity fields change via Lightning Data Service.”

All Three Struggled with LWC

LWC development is where AI coding assistants hit a wall. The component framework has specific patterns—decorators like @wire, @api, @track—that all three tools understood. But the interplay between Lightning Data Service, imperative Apex calls, and reactive properties tripped them up.

Copilot used @track decorators, which haven't been needed for simple reactive properties since the Spring '20 LWC release; fields are reactive by default now. It also imported getRecord from the wrong module (it belongs in lightning/uiRecordApi).

Cursor produced the cleanest HTML template with a proper SVG gauge component. But it wired the Apex method incorrectly—using @wire with dynamic parameters that would cause infinite re-renders.

Claude Code got the wire adapter right and included proper error boundaries. But its CSS was… ambitious. It tried to create a gradient-filled circular gauge with pure CSS, and the result looked like a 2015 dashboard widget. Frontend design isn’t its strength.

Here’s what actually worked from Claude Code’s controller:

public with sharing class OpportunityScoreController {
    @AuraEnabled(cacheable=true)
    public static OpportunityScoreResult getScore(Id opportunityId) {
        Opportunity opp = [
            SELECT Id, Amount, StageName, Probability, 
                   CloseDate, Account.Industry,
                   (SELECT Id, Status FROM Tasks WHERE Status != 'Completed'),
                   (SELECT Id, Subject FROM Events WHERE StartDateTime >= TODAY)
            FROM Opportunity 
            WHERE Id = :opportunityId
            LIMIT 1
        ];
        
        OpportunityScoreResult result = new OpportunityScoreResult();
        result.baseScore = calculateBaseScore(opp);
        result.engagementMultiplier = calculateEngagement(opp);
        result.finalScore = Math.min(100, 
            (Integer)(result.baseScore * result.engagementMultiplier));
        result.factors = getScoreFactors(opp);
        
        return result;
    }
    // calculateBaseScore, calculateEngagement, and getScoreFactors are private helpers omitted from this excerpt
    
    public class OpportunityScoreResult {
        @AuraEnabled public Integer baseScore;
        @AuraEnabled public Decimal engagementMultiplier;
        @AuraEnabled public Integer finalScore;
        @AuraEnabled public List<ScoreFactor> factors;
    }
    
    public class ScoreFactor {
        @AuraEnabled public String label;
        @AuraEnabled public String impact; // positive, negative, neutral
        @AuraEnabled public String detail;
    }
}

Winner: No clear winner. I’d take Claude Code’s Apex controller, Cursor’s HTML template, and write the CSS myself. None of them produced a ship-ready LWC. This matches my experience—AI assistants handle backend CRM logic much better than frontend component frameworks.

Task 7: HubSpot CRM Card (V3)

This was my gut-check test. HubSpot released the V3 CRM card format in late 2025, and training data cutoffs matter here. The prompt: “Build a HubSpot CRM card using the V3 UI extensions framework that displays customer health scores from an external API, with action buttons for creating support tickets.”

Copilot and Cursor both produced V2 CRM card JSON—the old results format with objectId parameters. Functional but deprecated.

Claude Code produced a V3 UI extension using React with the HubSpot CLI project structure. It correctly used hubspot.fetch for authenticated API calls and the @hubspot/ui-extensions SDK components. It got about 80% of the way there but used a Table component variant that doesn’t exist in the current SDK.

Winner: Claude Code by default. It was the only one that even attempted the V3 format. But I still spent 25 minutes fixing component imports. If you’re doing HubSpot custom card development, you’re still going to spend a lot of time in the docs regardless of which assistant you use.

The Scoreboard

Task                     Copilot   Cursor    Claude Code
Apex Lead Dedup          7/10      8.5/10    9/10
HubSpot Serverless       7/10      5/10*     9/10
Dynamics 365 Plugin      8/10      7/10      7.5/10
REST API Integration     6/10      8.5/10    8/10
Bulk Migration Script    7.5/10    7/10      8.5/10
LWC Component            5/10      6/10      6.5/10
HubSpot CRM Card V3      3/10      3/10      6/10

*Cursor’s score on Task 2 reflects the hallucinated API endpoint, which is worse than a partial solution.

Overall averages: Copilot 6.2, Cursor 6.4, Claude Code 7.8.

What This Actually Means for Your Workflow

Use Claude Code When…

You’re writing backend CRM logic—Apex classes, integration middleware, data processing scripts. It consistently produced the most complete solutions with proper error handling. It also had the best understanding of CRM-specific patterns like bulkification, governor limits, and OAuth token management.

The CLI-based workflow is less intuitive than Copilot’s inline suggestions, but for substantial pieces of code, the extra context you can provide through Claude Code’s project-aware features makes a real difference.

Use Cursor When…

You’re doing architecture-level work—designing integration patterns, building class hierarchies, planning batch processing chains. Cursor’s ability to ask clarifying questions (when configured to do so) led to better architectural decisions. The Queueable chain pattern it produced for Task 4 was genuinely the best approach.

It’s also my pick for refactoring existing CRM code. Paste in a messy trigger and ask for a restructured handler pattern—Cursor consistently produces cleaner refactors than the other two.

Use Copilot When…

You’re writing quick utilities, test classes, or small methods where inline autocomplete is more valuable than a full-file generation. Copilot’s speed advantage is real—it delivers suggestions in 1-3 seconds versus 5-15 seconds for the others. For repetitive CRM tasks like writing SOQL queries or mapping fields, that speed compounds.

It’s also the safest choice for Dynamics 365 development. Copilot scored highest on the Dynamics plugin task, likely because Microsoft’s own platform data feeds its training.

Common Patterns Where All Three Fail

After running these tests, three consistent failure modes stood out:

1. Platform-specific deployment configuration. None of them reliably generated correct sfdx-project.json, serverless.yml, or package.xml files. You’ll always need to handle deployment configs manually.

2. Complex SOSL queries. SOQL was fine across the board. SOSL (Salesforce Object Search Language) was consistently mangled—wrong syntax, invalid field lists, misused RETURNING clauses.

3. Multi-object relationship traversal beyond two levels. Ask for a query that traverses Account → Opportunity → OpportunityLineItem → Product2 → PricebookEntry and all three either hit a relationship limit they made up or generated invalid relationship paths.
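On the first of those failure modes: for reference, a minimal package.xml for deploying the Task 1 artifacts looks like the sketch below (class and trigger names follow this article's examples, the test class name is hypothetical, and the API version should be whatever your org targets):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>LeadDuplicateHandler</members>
        <members>LeadDuplicateHandlerTest</members>
        <name>ApexClass</name>
    </types>
    <types>
        <members>LeadDuplicateTrigger</members>
        <name>ApexTrigger</name>
    </types>
    <version>60.0</version>
</Package>
```

It's a small file, but the assistants kept inventing metadata type names or nesting members incorrectly, so I now keep a known-good template per client instead of regenerating it.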

My Actual Setup for CRM Development

I’ve settled on a hybrid workflow. Cursor is my primary editor for CRM projects. I use its built-in AI for architecture discussions and refactoring. For complex business logic—the kind where edge cases actually matter—I switch to Claude Code in a separate terminal and give it the full context of what I’m building.

GitHub Copilot stays active as the autocomplete layer inside Cursor (yes, you can run both). It handles the small stuff—finishing variable names, suggesting SOQL WHERE clauses, writing assert statements in test classes.

This combination has cut my CRM development time by roughly 35-40% on integration-heavy projects. The percentage is lower (maybe 15-20%) for frontend LWC work, where I still write most of the code myself.

What to Do Next

Pick one of these assistants and try it on your next CRM development task—something real, not a tutorial exercise. Compare its output against code you’ve already written. You’ll quickly see where it accelerates your work and where it slows you down with plausible-looking bugs.

If you’re evaluating CRM platforms alongside these tools, check out our Salesforce vs HubSpot comparison or browse AI-powered CRM tools to see how these assistants fit into the broader ecosystem.


Disclosure: Some links on this page are affiliate links. We may earn a commission if you make a purchase, at no extra cost to you. This helps us keep the site running and produce quality content.