From: hu-po

In quantum computing, information is typically represented by the state of an N-qubit system, which is a vector in a Hilbert space. The dimensionality of this space grows exponentially with the number of qubits, as 2^N [00:23:10]. Operations on these quantum states are described mathematically as the application of quantum gates, which are composed into quantum circuits [00:35:30].
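
To make the 2^N growth concrete, here is a minimal NumPy sketch. The helper name zero_state is an illustrative assumption, not something from the talk; it builds the state vector of N qubits that are all in state 0 and reports its length.

```python
import numpy as np

def zero_state(n_qubits: int) -> np.ndarray:
    """Return the state vector of n_qubits qubits that are all in state 0."""
    state = np.zeros(2 ** n_qubits, dtype=complex)
    state[0] = 1.0  # all amplitude sits on the basis state |00...0>
    return state

# The number of amplitudes doubles with every added qubit.
for n in (1, 2, 10):
    print(n, "qubits ->", zero_state(n).size, "amplitudes")
```

This doubling is also why a classical simulation of the full state vector becomes expensive quickly as qubits are added.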

Quantum Gates and Operators

A quantum gate is a unitary operator that transforms a quantum state [00:35:30]. Examples include:

  • Hadamard (H) Gate: This fundamental single-qubit gate is crucial for generating superposition states. Applying the Hadamard gate to a qubit in state 0 or 1 places it into an equal superposition of both, meaning it’s neither strictly 0 nor 1 but a combination [00:37:47]. Applying it twice returns the qubit to its original state, indicating it is its own inverse [00:38:47].
  • Pauli Operators (X, Y, Z): These are also unitary operators. The Pauli X gate performs a rotation of Pi radians around the X-axis [01:42:41], which flips a qubit’s state from 0 to 1 and vice versa. The Pauli Z gate, by contrast, leaves state 0 unchanged and flips the sign (phase) of state 1 [01:08:12].
  • Rotation Gates (Rx, Ry, Rz): These allow for rotations around specific axes (X, Y, Z) by a parameterized angle (Theta), offering more fine-grained control than a simple flip [01:42:41].
  • Controlled-NOT (CNOT) Gate: This is a two-qubit gate where one qubit acts as a control and the other as a target. If the control qubit is in state 1, the target qubit is flipped; otherwise, it remains unchanged [01:27:40]. CNOT gates are instrumental in creating and manipulating entanglement between qubits [01:29:11].
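
The gates listed above can be written down as small matrices. The following is a minimal NumPy sketch using the standard textbook definitions; the variable names and the choice of the first qubit as the CNOT control are assumptions made for illustration.

```python
import numpy as np

# Standard single-qubit gates as 2x2 matrices.
I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def rx(theta: float) -> np.ndarray:
    """Rotation by angle theta around the X-axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)

# CNOT with the first qubit as control, basis order |00>, |01>, |10>, |11>.
CNOT = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=complex,
)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

plus = H @ ket0                         # equal superposition of 0 and 1
bell = CNOT @ np.kron(H @ ket0, ket0)   # entangled state (|00> + |11>) / sqrt(2)

print(np.allclose(H @ H, I2))           # Hadamard applied twice is the identity
print(np.allclose(rx(np.pi), -1j * X))  # Rx(Pi) equals X up to a global phase
print(np.round(bell.real, 3))
```

The last line shows the usual recipe for entanglement: a Hadamard on the control qubit followed by a CNOT.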

Ansatz in Quantum Neural Networks

In quantum computing, an “ansatz” (a German word meaning roughly “approach” or “initial assumption”) refers to a quantum circuit with a predetermined geometry (connectivity and gates) that expresses a time-evolution unitary operator [00:50:31]. Ansatze are essentially parameterized quantum circuits (PQCs) [01:36:44].

Role in Quantum Neural Networks

In quantum self-attention neural networks (QSAN), an ansatz serves multiple purposes:

  • Data Encoding: A “U_encode” ansatz converts classical input data (e.g., text tokens) into corresponding quantum states by applying a Hadamard gate to put qubits in superposition, followed by a unitary operation based on the input data [00:59:10].
  • Queries, Keys, and Values: Separate ansatze (Uq, Uk, Uv) are used to represent the queries, keys, and values, each parameterized by learnable angles (Theta_q, Theta_k, Theta_v) [01:14:14]. These are not represented as traditional matrices but as quantum circuits [01:01:14].
  • Expressive Power: Repeated application of circuit structures (e.g., rotations and CNOT gates) within an ansatz enhances its expressive power. This lets the quantum system explore a more diverse set of states, which is beneficial for optimization tasks [01:25:57].
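
As a rough illustration of the data-encoding step, the sketch below applies a Hadamard gate to every qubit and then a data-dependent rotation. The choice of one Ry rotation per input feature, and the function name encode, are assumptions made for this example; the talk does not specify the exact form of U_encode.

```python
from functools import reduce

import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def ry(theta: float) -> np.ndarray:
    """Rotation by angle theta around the Y-axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def encode(features) -> np.ndarray:
    """Hypothetical U_encode: H on every qubit, then Ry(x_i) on qubit i."""
    n = len(features)
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0                                            # start in |00...0>
    hadamard_layer = reduce(np.kron, [H] * n)                 # superposition step
    data_layer = reduce(np.kron, [ry(x) for x in features])   # data-dependent unitary
    return data_layer @ hadamard_layer @ state

# Two classical features are mapped to a two-qubit quantum state.
state = encode([0.3, -1.2])
print(np.round(np.abs(state) ** 2, 3))  # measurement probabilities
```

Under the same assumption, the query, key, and value circuits would follow the same pattern, with learnable angles in place of the input features.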

Structure of an Ansatz

A typical ansatz consists of single-qubit rotations (like Rx, Ry) and CNOT entangling gates. The rotations act on individual qubits, while the CNOT gates are applied to pairs of qubits to create entanglement [01:29:06]. Repeating these steps builds up the depth of the quantum circuit [01:31:30].
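
The layered structure can be sketched with a small state-vector simulation. The specific layout used here (one Ry rotation per qubit, then a chain of CNOT gates between neighbouring qubits) and all function names are assumptions chosen for illustration; the talk only describes the general pattern of rotations followed by entangling gates.

```python
import numpy as np

def ry(theta: float) -> np.ndarray:
    """Rotation by angle theta around the Y-axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def apply_single(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    # Selecting control=1 removes one axis, so the target axis may shift by one.
    t_axis = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t_axis).copy()
    return psi.reshape(-1)

def ansatz_state(params) -> np.ndarray:
    """Run the layered circuit; params has shape (n_layers, n_qubits)."""
    params = np.asarray(params, dtype=float)
    n_layers, n = params.shape
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    for layer in params:                    # each repetition adds circuit depth
        for q, theta in enumerate(layer):   # single-qubit rotations
            state = apply_single(state, ry(theta), q, n)
        for q in range(n - 1):              # CNOT chain creates entanglement
            state = apply_cnot(state, q, q + 1, n)
    return state

# Two layers on three qubits: six rotation angles in total.
example = ansatz_state([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(example.shape)
```

In this sketch the number of parameters is simply layers times qubits, so even a few layers on a handful of qubits stays in the range of a few dozen angles.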

Initialization and Training

Ansatz parameters (the rotation angles) are typically initialized from a Gaussian distribution with a mean of zero and a small standard deviation (e.g., 0.01) [02:11:16]. This means the initial rotations are very small [02:12:24]. The parameters are optimized with stochastic gradient descent using analytical gradients, which means that backpropagation is possible through these quantum circuits [02:03:31].
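
A minimal sketch of this initialization and of one gradient step follows. It assumes the parameter-shift rule as the way to obtain analytical gradients, a toy single-qubit circuit Ry(theta) measured in Z, and a made-up loss (the sum of the expectation values); none of these specifics come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gaussian initialization: mean 0, standard deviation 0.01, four angles.
theta = rng.normal(loc=0.0, scale=0.01, size=4)

def expectation_z(angle: float) -> float:
    """<Z> for the state Ry(angle)|0>, which equals cos(angle)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return c ** 2 - s ** 2  # P(0) - P(1)

def parameter_shift_grad(f, angle: float) -> float:
    """Exact derivative of f for a rotation gate, from two shifted evaluations."""
    return 0.5 * (f(angle + np.pi / 2) - f(angle - np.pi / 2))

# One gradient-descent step on the hypothetical loss sum_i <Z>_i.
learning_rate = 0.1
grads = np.array([parameter_shift_grad(expectation_z, t) for t in theta])
theta_new = theta - learning_rate * grads

print(np.round(theta, 4))
print(np.round(grads, 4))
```

Because the initial angles are close to zero, the initial gradients in this toy example are small as well, since the derivative of cos(theta) is -sin(theta).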

Practical Considerations and Analogies to Classical Neural Networks

  • Hybrid Approach: Quantum self-attention neural networks often employ a hybrid approach, where quantum circuits process initial data, and classical computers handle tasks like calculating attention coefficients and final predictions [00:53:59].
  • High-Dimensional Feature Spaces: Quantum computers leverage exponentially large Hilbert spaces to create high-dimensional feature spaces for word correlations, potentially uncovering hidden relationships that are difficult to find classically [02:20:09].
  • Depth and Parameters: Currently, quantum neural networks are “shallow,” meaning they cannot chain too many consecutive operations because noise propagates through the circuit [01:22:22]. For instance, a QSAN model might have only a few dozen parameters, orders of magnitude fewer than classical deep learning models [02:13:52].
  • Noise Robustness: A critical design consideration for quantum machine learning algorithms on near-term quantum devices is robustness to noise, as external disturbances can propagate and lead to incorrect answers [00:06:03].
  • Simulations vs. Real Hardware: Most quantum machine learning research and algorithm testing are done via classical simulations rather than on actual quantum hardware due to the high cost and limited accessibility of real quantum computers [01:58:37].