Real-time decoding is crucial for fault-tolerant quantum computers. By enabling decoders to operate with low latency, concurrently with a quantum processing unit (QPU), we can apply corrections to the device within the coherence time. This prevents errors from accumulating, which would otherwise degrade the value of the results. This can be done online, with a real quantum device, or offline, with a simulated quantum processor.
To help solve these problems and enable research into better solutions, NVIDIA CUDA-Q QEC version 0.5.0 includes a variety of improvements. These include support for online real-time decoding, new GPU-accelerated algorithmic decoders, infrastructure for high-performance AI decoder inference, sliding window decoder support, and more Pythonic interfaces.
We’ll cover all of these improvements in this post and dive into how you can use them to accelerate your quantum error correction research, or operationalize real-time decoding with your quantum computer.
Real-time decoding is real with CUDA-Q QEC
Users can perform real-time decoding in a four-stage workflow. In order, these stages are: DEM generation, decoder configuration, decoder loading and initialization, and real-time decoding.
First, we characterize how the device errors behave during operation. Using a helper function, we can generate the detector error model (DEM) from a quantum code, noise model, and circuit parameters. The function generates a complete DEM that maps error mechanisms to syndrome patterns.
import cudaq
import cudaq_qec as qec

# Step 1: Generate detector error model
print("Step 1: Generating DEM...")
cudaq.set_target("stim")
# Distance-3 surface code, matching the 3 syndrome extraction rounds below
code = qec.get_code("surface_code", distance=3)
noise = cudaq.NoiseModel()
noise.add_all_qubit_channel("x", cudaq.Depolarization2(0.01), 1)
dem = qec.z_dem_from_memory_circuit(code, qec.operation.prep0, 3, noise)
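The returned DEM carries the objects used throughout the rest of the workflow, including the detector error matrix and the per-mechanism error rates; a quick sanity check (continuing the listing above) is to inspect their dimensions.
# Rows of the detector error matrix correspond to detectors (syndrome bits),
# columns to error mechanisms; error_rates has one entry per mechanism
print("Detector error matrix shape:", dem.detector_error_matrix.shape)
print("Number of error mechanisms:", len(dem.error_rates))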
The next step is to choose a decoder and configure it. We’ll discuss new decoders in greater detail in the following sections.
Using the DEM, the user configures the decoder and then saves this configuration to a YAML file. This file ensures that the decoders can correctly interpret the syndrome measurements.
# Create decoder config
config = qec.decoder_config()
config.id = 0
config.type = "nv-qldpc-decoder"
config.block_size = dem.detector_error_matrix.shape[1]
# ...
# See the full example at nvidia.github.io/cudaqx/examples_rst/qec/realtime_decoding.html
# ...
Before circuit execution, the user loads the YAML file. CUDA-Q QEC interprets the data, sets up the appropriate implementation in the decoder, and registers it with the CUDA-Q runtime.
# Save decoder config
with open("config.yaml", 'w') as f:
    f.write(config.to_yaml_str(200))
Now, users can begin executing quantum circuits. Inside CUDA-Q kernels, the decoding API interacts with the decoders. As the stabilizers of the logical qubits are measured, syndromes are enqueued to the corresponding decoder, which processes them. When corrections are needed, the decoder suggests operations to apply to the logical qubits.
# Load config and run circuit
qec.configure_decoders_from_file("config.yaml")
run_result = cudaq.run(qec_circuit, shots_count=10)
GPU-accelerated RelayBP
A recently developed decoder algorithm addresses the pitfalls of belief propagation (BP) decoders, a popular class of algorithmic decoders for quantum low-density parity check codes. BP+OSD (Belief Propagation with Ordered Statistics Decoding) relies on a GPU-accelerated BP decoder and then applies an ordered statistics post-processing algorithm on the CPU. If BP fails, OSD kicks in. This works, but it makes the decoder hard to optimize and parallelize for the low latency needed to enable real-time error decoding.
RelayBP modifies BP methods with the concept of memory strengths at each node of the graph, which control how much each node remembers or forgets past messages. This dampens or breaks the harmful symmetries that typically trap BP and prevent it from converging.
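As a rough intuition only (this is an illustrative sketch, not the nv-qldpc-decoder internals), the memory term can be pictured as blending each node's previous belief into its new BP message, with a per-node strength gamma controlling how much of the past is retained.
import numpy as np

# Illustrative memory-strength update: gamma near 1 keeps more of the old belief,
# gamma near 0 trusts the fresh BP message
def damped_update(old_belief, bp_message, gamma):
    return gamma * old_belief + (1.0 - gamma) * bp_message

old_belief = np.array([0.2, -1.3, 0.7])  # log-likelihood ratios from the previous iteration
bp_message = np.array([0.5, -0.4, 1.1])  # freshly computed BP messages
gammas = np.array([0.1, 0.3, 0.5])       # per-node memory strengths
print(damped_update(old_belief, bp_message, gammas))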
Users can instantiate a RelayBP decoder with just a few lines of code, as outlined below.
import numpy as np
import cudaq_qec as qec
# Easy 3x7 parity check matrix for demonstration
H_list = [[1, 0, 0, 1, 0, 1, 1], [0, 1, 0, 1, 1, 0, 1],
          [0, 0, 1, 0, 1, 1, 1]]
H = np.array(H_list, dtype=np.uint8)
# Configure relay parameters
srelay_config = {
    'pre_iter': 5,  # Run 5 iterations with gamma0 before relay legs
    'num_sets': 3,  # Use 3 relay legs
    'stopping_criterion': 'FirstConv'  # Stop after first convergence
}
# Create a decoder with Relay-BP
decoder_relay = qec.get_decoder("nv-qldpc-decoder",
                                H,
                                use_sparsity=True,
                                bp_method=3,
                                composition=1,
                                max_iterations=50,
                                gamma0=0.3,
                                gamma_dist=[0.1, 0.5],
                                srelay_config=srelay_config,
                                bp_seed=42)
print(" Created decoder with Relay-BP (gamma_dist, FirstConv stopping)")
# Decode a syndrome
syndrome = np.array([1, 0, 1], dtype=np.uint8)
decoded_result = decoder_relay.decode(syndrome)
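The returned result object reports whether BP converged along with the estimated error pattern; in recent CUDA-Q QEC releases these are exposed as converged and result fields (consult the decoder result API reference if your version differs).
# Inspect the output: convergence flag plus the estimated error pattern,
# one soft value per column of H
print("Converged:", decoded_result.converged)
print("Estimated error:", np.array(decoded_result.result))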
AI decoder inference
AI decoders are becoming increasingly popular for handling specific error models, offering better accuracy or lower latency than algorithmic decoders.
Users can develop AI decoders by generating training data, training a model, and exporting the model to ONNX. Once this is complete, the CUDA-Q QEC NVIDIA TensorRT-based AI decoder inference engine can be used to run low-latency AI decoders.
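As one possible sketch of the export step (assuming a PyTorch model; the tiny network and tensor names here are placeholders, not a recommended decoder architecture):
import torch

# Placeholder network mapping a 3-bit syndrome to per-qubit error probabilities;
# any trained PyTorch decoder can be exported the same way
model = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 7), torch.nn.Sigmoid())
model.eval()

dummy_syndrome = torch.zeros(1, 3)  # batch of one syndrome
torch.onnx.export(model, dummy_syndrome, "ai_decoder.onnx",
                  input_names=["syndrome"], output_names=["error_probs"])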
CUDA-Q QEC recently introduced infrastructure for integrated AI decoder inference with offline decoding. This means it’s now easy to run any AI decoder saved to an ONNX file with CUDA-Q QEC and an emulated quantum computer.
import cudaq_qec as qec
import numpy as np
# Note: The AI decoder doesn't use the parity check matrix.
# A placeholder matrix is provided here to satisfy the API.
H = np.array([[1, 0, 0, 1, 0, 1, 1],
              [0, 1, 0, 1, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 1]], dtype=np.uint8)
# Create TensorRT decoder from ONNX model
decoder = qec.get_decoder("trt_decoder", H,
                          onnx_load_path="ai_decoder.onnx")
# Decode a syndrome
syndrome = np.array([1.0, 0.0, 1.0], dtype=np.float32)
result = decoder.decode(syndrome)
print(f"Predicted error: {result}")
We also offer a variety of recommendations for reducing initialization time by creating pre-built TensorRT engines. With ONNX files supporting a range of precisions (int8, fp8, fp16, bf16, and tf32), you can explore a range of model and hardware combinations to optimize AI decoder operationalization.
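For example, one way to pre-build an engine ahead of time (a sketch assuming the trtexec tool that ships with TensorRT is on your PATH; see the CUDA-Q QEC documentation for how to point the trt_decoder at a pre-built engine):
import subprocess

# Build a serialized TensorRT engine from the ONNX file once, ahead of time,
# so decoder initialization doesn't pay the engine-build cost at run time
subprocess.run(["trtexec", "--onnx=ai_decoder.onnx",
                "--saveEngine=ai_decoder.engine", "--fp16"], check=True)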
Sliding window decoding
Sliding window decoders enable a decoder to handle circuit-level noise across multiple syndrome extraction rounds. These decoders process the syndrome before the entire measurement sequence is received, which can help reduce the overall latency. The tradeoff is that this can increase logical error rates.
Exploring how and when to use this tool depends on the noise model, the error correcting code parameters, and the latency budget of a given quantum processor. With the introduction of the sliding window decoder in 0.5.0, users can now perform experiments using any other CUDA-Q decoder as the “inner” decoder. Moreover, users can vary the window size with simple parameter changes.
import cudaq
import cudaq_qec as qec
import numpy as np
cudaq.set_target('stim')
num_rounds = 5
code = qec.get_code('surface_code', distance=num_rounds)
noise = cudaq.NoiseModel()
noise.add_all_qubit_channel("x", cudaq.Depolarization2(0.001), 1)
statePrep = qec.operation.prep0
dem = qec.z_dem_from_memory_circuit(code, statePrep, num_rounds, noise)
inner_decoder_params = {'use_osd': True, 'max_iterations': 50, 'use_sparsity': True}
opts = {
    'error_rate_vec': np.array(dem.error_rates),
    'window_size': 1,
    'num_syndromes_per_round': dem.detector_error_matrix.shape[0] // num_rounds,
    'inner_decoder_name': 'nv-qldpc-decoder',
    'inner_decoder_params': inner_decoder_params,
}
swdec = qec.get_decoder('sliding_window', dem.detector_error_matrix, **opts)
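Once constructed, the sliding window decoder is used like any other CUDA-Q QEC decoder. A minimal continuation of the listing above (using an all-zero placeholder syndrome rather than real measurement data) calls decode with one syndrome bit per detector in the DEM.
# The syndrome spans every detector across all rounds; an all-zero vector
# stands in here for real measurement data
num_detectors = dem.detector_error_matrix.shape[0]
syndrome = np.zeros(num_detectors, dtype=np.uint8)
result = swdec.decode(syndrome)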
Each syndrome extraction round must produce a constant number of measurements. The decoder makes no assumptions about temporal correlations or periodicity in the underlying noise, so users have maximal flexibility in investigating noise variations per round.
Getting started with CUDA-Q QEC
CUDA-Q QEC 0.5.0 brings a wide selection of tools to quantum error correction researchers and QPU operators, accelerating research into operationalizing fault-tolerant quantum computers.
To start using CUDA-Q QEC, you can pip install cudaq-qec and see the CUDA-Q QEC documentation.
