Role Definition
| Field | Value |
|---|---|
| Job Title | Computer Vision Engineer |
| Seniority Level | Mid-Level |
| Primary Function | Develops image and video analysis systems — object detection, segmentation, tracking, 3D reconstruction, visual SLAM. Builds perception pipelines for autonomous vehicles, manufacturing quality inspection, medical imaging, and AR/VR. Works with OpenCV, PyTorch/TensorFlow, CUDA, edge deployment frameworks (TensorRT, ONNX), and model optimisation techniques. |
| What This Role Is NOT | NOT an ML/AI Engineer (general ML — scored 68.2 Green Accelerated). NOT a Data Scientist (analysis — scored 19.0 Red). NOT a Robotics Software Engineer (full robot stack including motion planning, control systems). This is focused specifically on visual perception systems. |
| Typical Experience | 3-6 years. Typically holds an MS or PhD in computer science or EE with a vision/perception focus. Fluent in PyTorch, OpenCV, CUDA. May hold NVIDIA Deep Learning certifications. |
Seniority note: Junior CV engineers (0-2 years) who primarily run existing model training scripts and label data would score Yellow (Urgent) — foundation models automate their entry-level work fastest. Senior/Principal perception architects who define system-level perception strategy for safety-critical deployments would score deeper Green (Stable).
Protective Principles + AI Growth Correlation
| Principle | Score (0-3) | Rationale |
|---|---|---|
| Embodied Physicality | 0 | Fully digital, desk-based. Some CV engineers work with physical sensor rigs, but the vast majority of work occurs in code, cloud compute, and simulation environments. |
| Deep Interpersonal Connection | 0 | Primarily technical. Collaborates with robotics, product, and domain teams, but the core value is perception system output, not relationships. |
| Goal-Setting & Moral Judgment | 2 | Makes consequential decisions about perception architecture, model selection, and safety trade-offs. Determines what the system "sees" and how it interprets the world — errors in autonomous vehicles or medical imaging can be life-threatening. Significant technical judgment, but within established frameworks. |
| Protective Total | 2/9 | |
| AI Growth Correlation | 1 | More AI deployment drives more demand for perception systems — autonomous vehicles, smart manufacturing, medical imaging all grow with AI adoption. However, the relationship is not recursive in the way ML/AI Engineer is (CV engineers use AI but don't build the foundational AI that creates demand for more AI). Weak positive, not strong positive. |
Quick screen result: Protective 2 + Correlation 1 = Likely Yellow or borderline Green. Proceed to quantify — task resistance and evidence will determine the final zone.
Task Decomposition (Agentic AI Scoring)
| Task | Time % | Score (1-5) | Weighted | Aug/Disp | Rationale |
|---|---|---|---|---|---|
| Perception pipeline development (object detection, segmentation, tracking) | 25% | 2 | 0.50 | AUGMENTATION | Q2: Foundation models (SAM, DINO, CLIP) handle generic segmentation and detection well. But custom perception for safety-critical AV, manufacturing defect detection with sub-pixel accuracy, and real-time multi-object tracking in novel environments require human-designed architectures, domain-specific tuning, and system-level integration that AI cannot perform autonomously. |
| Model training, evaluation, and experimentation | 20% | 3 | 0.60 | AUGMENTATION | Q2: AutoML, NAS, and hyperparameter optimisation tools automate significant portions of the training loop. AI handles experiment tracking and standard model selection. Human leads on designing evaluation protocols for safety-critical systems, interpreting failure modes, and selecting architectures for novel domains. |
| Edge deployment and model optimisation (ONNX, TensorRT, quantisation, pruning) | 15% | 2 | 0.30 | AUGMENTATION | Q2: AI assists with standard quantisation and compilation. But optimising a perception model to run at 30fps on an embedded GPU with 4W power budget while maintaining safety-critical accuracy thresholds requires deep hardware-software understanding. Edge deployment is a differentiating moat. |
| Data pipeline and annotation management | 10% | 4 | 0.40 | DISPLACEMENT | Q1: Yes — AI performs annotation (SAM, auto-labelling), data augmentation, dataset curation, and quality checks. Foundation models have largely automated the annotation bottleneck that consumed CV engineer time. Human reviews edge cases and validates quality but the bulk is AI-executed. |
| Sensor integration and calibration (camera, LiDAR, depth sensors) | 10% | 2 | 0.20 | AUGMENTATION | Q2: AI assists with auto-calibration algorithms. Human designs multi-sensor fusion architectures, handles intrinsic/extrinsic calibration, resolves sensor-specific artefacts, and validates in physical environments. This is closer to embedded systems work — hardware-adjacent. |
| 3D reconstruction, visual SLAM, multi-view geometry | 10% | 2 | 0.20 | NOT INVOLVED | Classical CV and geometric reasoning — not well-suited to AI automation. Multi-view geometry, structure-from-motion, and SLAM algorithms require mathematical rigour and domain-specific customisation that AI agents cannot reliably produce. Genuine novelty in each deployment environment. |
| Documentation, code review, cross-functional collaboration | 10% | 3 | 0.30 | AUGMENTATION | Q2: AI generates documentation, assists with code review, and drafts technical specifications. Human reviews for accuracy, leads design discussions, and communicates perception system capabilities and limitations to cross-functional teams. |
| Total | 100% | — | 2.50 | | |
Task Resistance Score: 6.00 - 2.50 = 3.50/5.0
Displacement/Augmentation split: 10% displacement, 80% augmentation, 10% not involved.
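The weighted totals above can be reproduced with a short stdlib-only sketch. The weights, scores, and categories are taken directly from the task table; the 6.00 inversion constant is from the resistance formula as stated above. Variable names are illustrative only:

```python
# Task decomposition: (time share, automatability score 1-5, category)
# Categories: AUG = augmentation, DISP = displacement, NI = not involved.
tasks = [
    (0.25, 2, "AUG"),   # perception pipeline development
    (0.20, 3, "AUG"),   # model training, evaluation, experimentation
    (0.15, 2, "AUG"),   # edge deployment and model optimisation
    (0.10, 4, "DISP"),  # data pipeline and annotation management
    (0.10, 2, "AUG"),   # sensor integration and calibration
    (0.10, 2, "NI"),    # 3D reconstruction, visual SLAM, multi-view geometry
    (0.10, 3, "AUG"),   # documentation, code review, collaboration
]

weighted_total = sum(w * s for w, s, _ in tasks)             # 2.50
task_resistance = 6.00 - weighted_total                      # 3.50 (inverted scale)
displacement = sum(w for w, _, c in tasks if c == "DISP")    # 0.10
augmentation = sum(w for w, _, c in tasks if c == "AUG")     # 0.80
time_scoring_3_plus = sum(w for w, s, _ in tasks if s >= 3)  # 0.40 (sub-label input)

print(round(weighted_total, 2), round(task_resistance, 2),
      round(displacement, 2), round(augmentation, 2), round(time_scoring_3_plus, 2))
```

The same `time_scoring_3_plus` figure (40%) feeds the sub-label determination further down.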
Reinstatement check (Acemoglu): Yes. AI creates new CV engineering tasks: deploying and fine-tuning foundation vision models (SAM, DINO, Florence) for domain-specific applications, building multi-modal perception systems that fuse vision with LLM reasoning, developing AI-powered annotation validation pipelines, and designing perception systems for novel AI-driven products (AR glasses, autonomous delivery, surgical robotics). The task portfolio is expanding, not shrinking.
Evidence Score
| Dimension | Score (-2 to 2) | Evidence |
|---|---|---|
| Job Posting Trends | 1 | AI/ML postings surged 163% YoY to 49,200 in 2025 (Lightcast). CV engineer postings are a subset of this growth. LinkedIn lists CV engineer among the top 12 in-demand AI roles for 2026. Motion Recruitment flags CV engineer as an "emerging top earner." Growth is strong but not as explosive as general ML engineering — CV is a specialisation within the broader AI hiring wave. |
| Company Actions | 1 | Waymo, Cruise, Tesla, WeRide actively hiring perception engineers — WeRide surpassed 1,000 autonomous taxis in January 2026. AV sensor market projected to grow at 60.4% CAGR 2025-2030 (Technavio). No reports of companies cutting CV teams. However, hiring is concentrated in AV, manufacturing, and medical imaging — not as broad-based as general ML hiring. |
| Wage Trends | 1 | Glassdoor: $163,741 average total pay for CV engineers. Coursera/Payscale: $115K-$132K base. Hired.com: $100K-$250K range, $150K median. FAANG CV roles: Meta $214K-$335K, Google $235K-$373K, Cruise $259K-$407K. Mid-level premium above general SWE ($133K) but below pure ML engineer ($187K). Growing with the market but not surging ahead of it. |
| AI Tool Maturity | 0 | Foundation models (SAM, SAM 2, DINOv2, CLIP, Florence) have democratised basic CV tasks — zero-shot segmentation, open-vocabulary detection, image classification. This is a double-edged sword: it reduces demand for engineers doing basic CV but increases demand for engineers integrating these models into production systems. Auto-labelling tools (Roboflow, V7, Encord) automate annotation. Custom perception for safety-critical and edge applications remains beyond automated tools. Scored 0 — tools are powerful but create as much work as they displace. |
| Expert Consensus | 1 | Leland's Top 20 AI Careers: "Demand for computer vision engineers is growing rapidly, especially in autonomous driving, security, and healthcare." OpenCV 2025 career goals highlight growing complexity of the role. Reddit r/computervision: consensus that CV engineering requires deep mathematical and systems knowledge beyond what AI tools provide. However, some experts note that foundation models are compressing the skill gap — "you can be productive in CV without deep ML knowledge now." Mixed-to-positive, not unanimously bullish. |
| Total | 4 | |
Barrier Assessment
Reframed question: What prevents AI execution even when programmatically possible?
| Barrier | Score (0-2) | Rationale |
|---|---|---|
| Regulatory/Licensing | 1 | No formal licensing for CV engineers. But safety-critical applications (autonomous vehicles under UNECE WP.29, medical imaging under FDA 510(k)/De Novo, EU AI Act high-risk classification for AV perception) require human accountability for perception system performance. Regulatory barriers are domain-dependent — consumer CV has minimal oversight. |
| Physical Presence | 0 | Fully remote capable. Some sensor calibration and edge deployment work benefits from physical access, but the vast majority is desk-based. |
| Union/Collective Bargaining | 0 | Tech sector, at-will employment. No union protection. |
| Liability/Accountability | 1 | Perception system failures in autonomous vehicles, medical imaging, and manufacturing inspection can cause physical harm or death. A human must be accountable for what the system "sees" and doesn't see. NHTSA investigations into AV crashes routinely examine perception pipeline decisions. However, liability typically falls on the company and system architect, not the mid-level engineer. |
| Cultural/Ethical | 0 | Society is broadly embracing CV in surveillance, healthcare, manufacturing, and autonomous systems. Some privacy/bias concerns around facial recognition, but these affect deployment decisions, not the demand for CV engineers. |
| Total | 2/10 | |
AI Growth Correlation Check
Confirmed at +1 (Weak Positive). Computer vision demand grows with AI adoption — more AI means more autonomous vehicles, more smart manufacturing, more medical imaging, more AR/VR. The CV market is projected to grow from $20B to $58-73B by 2030 (19.8% CAGR, Grand View Research). However, the relationship is not recursive: CV engineers build perception systems that use AI, but they don't build the foundational AI models themselves. The ML/AI Engineer (scored +2) builds the models that CV engineers deploy. CV engineering benefits from AI growth but doesn't directly feed it. Not Accelerated Green.
JobZone Composite Score (AIJRI)
| Input | Value |
|---|---|
| Task Resistance Score | 3.50/5.0 |
| Evidence Modifier | 1.0 + (4 × 0.04) = 1.16 |
| Barrier Modifier | 1.0 + (2 × 0.02) = 1.04 |
| Growth Modifier | 1.0 + (1 × 0.05) = 1.05 |
Raw: 3.50 × 1.16 × 1.04 × 1.05 = 4.4335
JobZone Score: (4.4335 - 0.54) / 7.93 × 100 = 49.1/100
Zone: GREEN (Green ≥48, Yellow 25-47, Red <25)
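As a check, the composite arithmetic above can be reproduced directly. The modifier weightings (0.04, 0.02, 0.05), the 0.54 offset, and the 7.93 divisor are the normalisation constants exactly as given in the formula above:

```python
# Inputs from the task decomposition, evidence, barrier, and growth sections.
task_resistance = 3.50
evidence_modifier = 1.0 + 4 * 0.04  # 1.16
barrier_modifier = 1.0 + 2 * 0.02   # 1.04
growth_modifier = 1.0 + 1 * 0.05    # 1.05

# Composite: resistance scaled by the three modifiers, then normalised to 0-100.
raw = task_resistance * evidence_modifier * barrier_modifier * growth_modifier
jobzone = (raw - 0.54) / 7.93 * 100

# Zone thresholds as stated: Green >= 48, Yellow 25-47, Red < 25.
zone = "GREEN" if jobzone >= 48 else "YELLOW" if jobzone >= 25 else "RED"
print(round(raw, 4), round(jobzone, 1), zone)  # → 4.4335 49.1 GREEN
```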
Sub-Label Determination
| Metric | Value |
|---|---|
| % of task time scoring 3+ | 40% |
| AI Growth Correlation | 1 |
| Sub-label | Green (Transforming) — AIJRI ≥ 48 AND ≥20% task time scores 3+ |
Assessor override: None — formula score accepted. The 49.1 sits just 1.1 points above the Green threshold. This is an honest borderline score. The evidence (+4) and growth (+1) provide the margin. An override is not warranted — the score accurately reflects a role that is protected by specialist expertise but under real pressure from foundation models.
Assessor Commentary
Score vs Reality Check
The 49.1 JobZone Score sits 1.1 points above the Green/Yellow boundary. This is a genuine borderline case — the thinnest Green in the Data & AI domain. The score is honest: foundation vision models (SAM, DINO, CLIP) are rapidly democratising tasks that were specialist CV engineering work two years ago. The Green classification depends entirely on the continuing demand for custom perception systems in safety-critical domains (AV, medical, manufacturing) and the edge deployment moat. If foundation models advance to handle domain-specific safety-critical perception end-to-end, this role drops to Yellow within 2-3 years. Compare to ML/AI Engineer (68.2) — the gap is 19.1 points, reflecting that general ML engineering has stronger demand growth, higher wages, and recursive AI correlation that CV engineering lacks.
What the Numbers Don't Capture
- Foundation model compression trajectory. SAM, SAM 2, DINOv2, and Florence-2 are advancing rapidly. Each generation handles more complex CV tasks out-of-the-box. The "custom perception pipeline" moat is eroding year by year. Tasks scored 2 today (perception pipeline development, sensor integration) could shift to 3 within 2-3 years as foundation models improve, which would push the score below the Green threshold.
- Domain bifurcation. Safety-critical CV engineering (autonomous vehicles, medical imaging, aerospace) is substantially more resistant than the 3.50 average suggests — regulatory barriers, liability, and domain expertise create a deep moat. Consumer/enterprise CV (social media filters, retail analytics, content moderation) is closer to Yellow, with foundation models handling most use cases via API calls.
- Supply shortage confound. The positive evidence signals are partly driven by an acute shortage of CV engineers with production AV experience. AV companies are competing for a small talent pool. If AV investment cycles or foundation model capabilities shift, these premiums could compress quickly.
- Title rotation risk. "Computer Vision Engineer" may be absorbed into "ML Engineer" or "Perception Engineer" as the boundaries blur. The work persists but the specialist title may not — reducing the salary premium that specialist positioning currently provides.
Who Should Worry (and Who Shouldn't)
If your daily work involves building custom perception systems for safety-critical applications — autonomous vehicle perception stacks, medical imaging pipelines requiring FDA clearance, manufacturing quality inspection with sub-pixel accuracy, or edge deployment on power-constrained hardware — you are more protected than the 49.1 score suggests. These domains have regulatory barriers, domain-specific expertise requirements, and physical-world constraints that foundation models cannot address.
If you primarily train standard object detection models, run image classification experiments, or build CV pipelines that could be replaced by foundation model API calls — you are closer to Yellow (Urgent). Foundation models like SAM and DINO already handle basic segmentation and detection better than most custom-trained models, and the gap is widening.
The single biggest separator: whether your perception systems must work in the physical world under safety constraints. The CV engineer deploying perception on an autonomous vehicle at 70mph in rain has a fundamentally different risk profile from one building a product image classifier. Same job title, different futures.
What This Means
The role in 2028: The mid-level CV engineer of 2028 spends less time training custom detection and segmentation models from scratch — foundation models handle the baseline. Instead, they focus on fine-tuning vision foundation models for domain-specific applications, building multi-sensor fusion systems, optimising perception for edge hardware, and ensuring safety-critical performance under adversarial conditions. The role shifts from "build a detector" to "integrate, optimise, and validate perception systems." Teams get leaner as foundation models reduce boilerplate, but growing demand from AV, medical imaging, AR/VR, and industrial automation absorbs the productivity gains.
Survival strategy:
- Specialise in safety-critical perception domains. AV, medical imaging, or aerospace CV work carries regulatory barriers and domain expertise that foundation models cannot replicate. Generic CV skills are commoditising — domain expertise is the moat.
- Master edge deployment and model optimisation. TensorRT, ONNX, quantisation, pruning, and deployment on power-constrained embedded hardware are differentiating skills that AI tools handle poorly. This is the embedded systems equivalent for CV engineers.
- Learn to orchestrate foundation vision models. SAM, DINO, CLIP, and Florence are tools, not threats, if you know how to fine-tune, combine, and deploy them for production use cases. The CV engineer who treats foundation models as building blocks rather than competitors is the one who thrives.
Where to look next. If you're considering a career shift, these Green Zone roles share transferable skills with computer vision engineering:
- ML/AI Engineer (AIJRI 68.2) — your PyTorch, model training, and deployment skills transfer directly; broaden from vision to general ML systems
- AI Security Engineer (AIJRI 79.3) — adversarial ML, model robustness, and AI safety testing leverage core CV research skills
- Embedded Systems Developer (AIJRI 56.8) — edge deployment, CUDA, and hardware-constrained optimisation skills translate to firmware and embedded systems work
Browse all scored roles at jobzonerisk.com to find the right fit for your skills and interests.
Timeline: 3-5 years for significant daily workflow transformation as foundation vision models mature. No displacement timeline for safety-critical perception specialists — the regulatory and physical-world constraints extend protection 10+ years. Generic CV work faces compression within 2-3 years.