Classifying global state preparation via deep reinforcement learning

| Metadata | Details |
| --- | --- |
| Publication Date | 2020-11-05 |
| Journal | Machine Learning Science and Technology |
| Authors | Tobias Haug, Wai-Keong Mok, Jia-Bin You, Wenzu Zhang, Ching Eng Png |
| Citations | 35 |
| Analysis | Full AI Review Included |

Technical Documentation & Analysis: Deep Reinforcement Learning for NV Center Control

This document analyzes the research paper “Classifying global state preparation via deep reinforcement learning” (arXiv:2005.12759v1) and connects its material requirements to 6CCVD’s advanced MPCVD diamond capabilities, focusing on Single Crystal Diamond (SCD) substrates essential for Nitrogen-Vacancy (NV) center research.


  • Core Achievement: Demonstration of global quantum control using Deep Reinforcement Learning (DRL) to generate protocols for preparing arbitrary superposition states in multi-level Nitrogen-Vacancy (NV) centers.
  • Performance Metrics: Achieved high mean fidelity (F > 0.972) for state preparation across the continuous two-dimensional Bloch sphere subspace.
  • Speed Breakthrough: Protocols achieved state preparation in $T \approx 0.5$ ns using only 9 timesteps, significantly faster than conventional adiabatic methods (e.g., STIRAP), which reduces the impact of dissipation.
  • Methodological Insight: The DRL approach, utilizing Proximal Policy Optimization (PPO), automatically clusters near-optimal protocols into distinct “phases,” providing physical insights into optimal preparation timescales and constraints.
  • Material Requirement: The success of this quantum control relies fundamentally on high-purity, low-strain Single Crystal Diamond (SCD) to host the NV centers and maintain long electron spin coherence times ($T_2$).
  • 6CCVD Value Proposition: 6CCVD provides the necessary Optical Grade SCD substrates, custom dimensions, and advanced polishing ($R_a < 1$ nm) required to replicate and scale this high-speed quantum control research.

The following hard data points were extracted from the experimental results and simulation parameters detailed in the paper:

| Parameter | Value | Unit | Context |
| --- | --- | --- | --- |
| Mean Fidelity (F) | 0.972 | N/A | Optimized protocol (closed system approximation) |
| Protocol Duration (T) | ~0.5 | ns | Time required for arbitrary state preparation |
| Maximum Protocol Time | 0.8 | ns | Upper bound for variable time per step |
| Number of Timesteps ($N_T$) | 9 | N/A | Optimized discrete steps in the protocol |
| Driving Strength Range ($\Omega_{1,2}$) | ±20 | GHz | Range of applied laser Rabi frequencies |
| Detuning ($\delta_1$) | 50 | GHz | Relative detuning of the first driving laser |
| External Magnetic Field ($B_{ext}$) | 0.15 | T | Applied along the NV quantization axis to lift degeneracy |
| NV Ground State Splitting ($D_{gs}$) | $2\pi \times 2.88$ | GHz | Zero-field splitting of the spin-1 triplet |
| Dissipation Timescale | $\gg 13$ | ns | Protocol time must be much faster than this limit |
| Neural Network Neurons ($N_H$) | 600 | N/A | Used in two fully-connected hidden layers |
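
As a rough orientation for the timescales listed above, the following back-of-the-envelope estimate may help. It assumes, purely for illustration, that the nine steps are equally long, whereas the protocol described here allows a variable length per step:

$$
\overline{\Delta t} = \frac{T}{N_T} \approx \frac{0.5\ \text{ns}}{9} \approx 0.056\ \text{ns},
\qquad
\frac{T}{13\ \text{ns}} \approx \frac{0.5}{13} \approx 0.04 .
$$

Under that assumption each step lasts roughly 56 ps, and the whole protocol takes about 4% of the 13 ns scale quoted for dissipation.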

The study used a numerical quantum control simulation driven by Deep Reinforcement Learning (DRL).

  1. System Modeling: The physical system is a multi-level Nitrogen-Vacancy (NV) center, modeled as a 10-level system (3 ground states, 6 excited states, 1 metastable state). For the fast control regime ($T \ll 13$ ns), the system is approximated as an effective closed 8-level system.
  2. Control Mechanism: Coherent control between the $|-1\rangle$ and $|+1\rangle$ triplet ground states is achieved by applying two time-dependent driving lasers ($\Omega_1(t), \Omega_2(t)$) that couple the ground states indirectly via the excited state manifold.
  3. Protocol Structure: The control protocol $\beta(t)$ is a piece-wise constant function defined by $N_T=9$ steps, where each step determines the driving strengths ($\Omega_1^{(k)}, \Omega_2^{(k)}$) and the timestep length ($\Delta t^{(k)}$).
  4. Optimization Algorithm: Deep Reinforcement Learning (DRL) was implemented using the Actor-Critic method with Proximal Policy Optimization (PPO).
  5. Training Strategy: The neural network was trained over 800,000 epochs using randomly sampled target states $\Psi_{target}(\theta, \phi)$. The sampling was biased towards areas of lower fidelity to ensure global convergence and prevent the algorithm from getting stuck in local minima.
  6. Reward Maximization: Training maximizes the fidelity $F = |\langle\Psi(t_{N_T})|\Psi_{target}\rangle|^2$ between the final state and the target state, which serves as the reward.
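
The protocol structure and reward described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's model: it replaces the 8-level NV system with a hypothetical 3-level Lambda system (two ground states coupled through a single excited state), assumes $\hbar = 1$ and a rotating frame, converts the GHz values to rad/ns with a factor of $2\pi$, starts in $|-1\rangle$, and uses a random protocol instead of one produced by a trained network. All function names are invented for this example.

```python
import numpy as np

def lambda_hamiltonian(omega1, omega2, delta):
    """Toy Hamiltonian in the basis (|-1>, |+1>, |e>), hbar = 1, units of rad/ns.

    Assumption: a single excited state |e> stands in for the excited manifold.
    """
    return np.array(
        [[0.0, 0.0, omega1 / 2],
         [0.0, 0.0, omega2 / 2],
         [omega1 / 2, omega2 / 2, delta]],
        dtype=complex,
    )

def step_propagator(h, dt):
    """Return exp(-i * h * dt) for a Hermitian matrix h via eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(-1j * evals * dt)) @ evecs.conj().T

def evolve(psi0, protocol, delta):
    """Apply a piecewise-constant protocol given as (omega1, omega2, dt) steps."""
    psi = np.asarray(psi0, dtype=complex)
    for omega1, omega2, dt in protocol:
        psi = step_propagator(lambda_hamiltonian(omega1, omega2, delta), dt) @ psi
    return psi

def target_state(theta, phi):
    """Bloch-sphere superposition of |-1> and |+1> with no excited-state weight."""
    return np.array(
        [np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2), 0.0],
        dtype=complex,
    )

def fidelity(psi, psi_target):
    """F = |<psi_target|psi>|^2, the quantity used as the reward."""
    return abs(np.vdot(psi_target, psi)) ** 2

# Illustrative parameters taken from the table, with an assumed 2*pi conversion.
rng = np.random.default_rng(0)
n_steps = 9
total_time = 0.5                      # ns
delta = 2 * np.pi * 50.0              # rad/ns
omega_max = 2 * np.pi * 20.0          # rad/ns

# Random (untrained) protocol: variable step lengths that sum to total_time.
dts = rng.dirichlet(np.ones(n_steps)) * total_time
omegas = rng.uniform(-omega_max, omega_max, size=(n_steps, 2))
protocol = [(o1, o2, dt) for (o1, o2), dt in zip(omegas, dts)]

psi_start = target_state(0.0, 0.0)    # |-1>, an assumed initial state
psi_final = evolve(psi_start, protocol, delta)
f = fidelity(psi_final, target_state(np.pi / 2, 0.0))
print(f"fidelity of a random protocol: {f:.4f}")
```

A random protocol will generally give a low fidelity. In the approach described above, the reinforcement learning agent would choose the `(omega1, omega2, dt)` triples so that `fidelity` is maximized for each sampled target $(\theta, \phi)$.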

The successful implementation of high-speed quantum control in NV centers is critically dependent on the quality of the host diamond material. 6CCVD specializes in providing the high-specification MPCVD diamond required for cutting-edge quantum research.

To replicate and advance this research, the primary material requirement is high-purity, low-strain Single Crystal Diamond (SCD).

| Material Specification | 6CCVD Material Grade | Relevance to NV Center Research |
| --- | --- | --- |
| NV Host Substrate | Optical Grade Single Crystal Diamond (SCD) | Essential for hosting NV centers. Our SCD features ultra-low nitrogen content (< 1 ppb), maximizing electron spin coherence time ($T_2$). |
| Doping Options | Controlled Nitrogen Doping | Allows for precise control over NV concentration and depth, crucial for optimizing optical coupling and minimizing surface effects. |
| Alternative Substrates | High-Purity Polycrystalline Diamond (PCD) | While SCD is preferred for coherence, high-quality PCD (up to 125 mm) can be used for large-area sensor arrays or structural components where coherence requirements are less stringent. |

6CCVD’s in-house capabilities directly address the complex engineering needs of quantum device fabrication, offering flexibility far beyond standard commercial wafers.

| Research Requirement | 6CCVD Customization Capability | Benefit to Quantum Engineers |
| --- | --- | --- |
| Specific Dimensions/Shapes | Custom dimensions for plates/wafers up to 125 mm (PCD) and large-area SCD substrates (up to 10 mm thickness). | Supports integration into complex optical setups and scaling up device prototypes. |
| Thin Film NV Layers | SCD growth control from 0.1 µm up to 500 µm thickness. | Enables creation of shallow NV layers for enhanced coupling to external fields or deep layers for bulk sensing applications. |
| Integrated Control Structures | Internal metalization services: Au, Pt, Pd, Ti, W, Cu. | Allows for direct integration of on-chip microwave antennas or electrodes, necessary for hybrid laser/microwave control schemes often used in NV systems. |
| Optical Surface Quality | Advanced polishing services achieving $R_a < 1$ nm for SCD and $R_a < 5$ nm for inch-size PCD. | Minimizes optical scattering and loss, ensuring efficient coupling of the driving lasers ($\Omega_1, \Omega_2$) to the NV centers. |

6CCVD maintains an in-house team of PhD-level material scientists and engineers specializing in MPCVD diamond for quantum applications. We offer comprehensive support for projects involving high-speed quantum control and NV center fabrication. Our team can assist researchers in selecting the optimal diamond grade, orientation, and processing parameters (e.g., surface termination, doping levels) necessary to achieve target coherence times and optical performance for similar quantum sensing and quantum simulation projects.

For custom specifications or material consultation, visit 6ccvd.com or contact our engineering team directly.

Abstract Quantum information processing often requires the preparation of arbitrary quantum states, such as all the states on the Bloch sphere for two-level systems. While numerical optimization can prepare individual target states, they lack the ability to find general control protocols that can generate many different target states. Here, we demonstrate global quantum control by preparing a continuous set of states with deep reinforcement learning. The protocols are represented using neural networks, which automatically groups the protocols into similar types, which could be useful for finding classes of protocols and extracting physical insights. As application, we generate arbitrary superposition states for the electron spin in complex multi-level nitrogen-vacancy centers, revealing classes of protocols characterized by specific preparation timescales. Our method could help improve control of near-term quantum computers, quantum sensing devices and quantum simulations.