Hi,
first of all, thanks for reading the post and sending your questions. Let me remind everyone that this is not a scientific paper. In this post I just want to show that the technology exists to couple pre-trained "classical" models with quantum layers. I leave it to your imagination what real-life applications may look like.
To answer your questions:
- Justification: why is this even applicable, and why can it improve the model?
I'm not sure what you mean here, but I believe a deeper quantum layer may help (in the example I'm using just four qubits). There are a number of variational architectures one could test in principle; a sketch of what I mean by a deeper layer is below. As for why one should try out quantum layers in the first place, I believe the most honest answer in the case I proposed is "because it's possible". On the other hand, if your application concerns chemistry (e.g. molecular structure), there are good theoretical reasons to expect quantum algorithms to outperform classical ones for simulating large molecules.
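For the record, here is a minimal sketch of a deeper variational layer. I'm assuming PennyLane here, and the template choice (`AngleEmbedding` plus `StronglyEntanglingLayers`) and the `n_layers = 6` depth are just illustrative, not what the post actually uses:

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4   # same width as in the post
n_layers = 6   # hypothetical depth; deeper than a single variational block

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_layer(inputs, weights):
    # Encode the classical features as single-qubit rotations
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    # A stack of entangling variational layers; depth is set by n_layers
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # One expectation value per qubit goes back to the classical model
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

# The weight tensor shape is dictated by the chosen template and depth
shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
weights = np.random.uniform(0, 2 * np.pi, size=shape)

print(quantum_layer(np.array([0.1, 0.2, 0.3, 0.4]), weights))
```

The only thing being varied relative to the post's setup is the depth knob; everything else stays a four-qubit circuit.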
- An explanation of what the "black box" really is. There is no diagram, even at the link.
That's not an easy question to answer. What the qubits do in practice is sit in a superposition state, explore the possible outcomes simultaneously, and collapse to one of them when measured. The procedure is repeated a number of times because of its intrinsically statistical nature (see the sketch below). If you want to know what the qubits are "actually" doing, that becomes a philosophical question; I suggest you read up on the interpretations of quantum mechanics.
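To make the "repeated a number of times" point concrete, here's a tiny sketch (again assuming PennyLane; the shot count of 1000 is arbitrary): a qubit in an equal superposition, sampled many times, with each shot collapsing to a single outcome.

```python
import pennylane as qml

# A finite-shot device: every execution is sampled repeatedly, and any
# expectation value is just an estimate from the measurement statistics
dev = qml.device("default.qubit", wires=1, shots=1000)

@qml.qnode(dev)
def circuit():
    qml.Hadamard(wires=0)             # equal superposition of |0> and |1>
    return qml.sample(qml.PauliZ(0))  # each shot collapses to +1 or -1

samples = circuit()
print(samples[:10])    # the individual collapsed outcomes
print(samples.mean())  # close to 0, the true expectation value
```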
- Practical results. Of course it is impossible at first to beat SOTA on ImageNet or the like, but at least on some synthetic task. This problem is pretty easy even for a simple net (actually all the complex work is done by the encoder), and yet the quantum net fails at it (maybe there is some missing hyperparameter tuning?).
I didn't spend much time doing any tuning. As I said, I just wanted to show that one can put together pre-trained models with quantum layers. I believe it would be much more interesting to create a quantum embedding layer that reproduces the CBoW or skip-gram architectures; a rough sketch of what I have in mind is below. My two cents!
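For what it's worth, one way such a hybrid layer could look, using PennyLane's `TorchLayer`. To be clear, the skip-gram-style wiring here (the `QuantumSkipGram` class, the sizes, the `AngleEmbedding`/`BasicEntanglerLayers` choice) is entirely hypothetical on my part, not something from the post:

```python
import torch
import pennylane as qml

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weight_shapes = {"weights": (2, n_qubits)}  # two entangling layers
qlayer = qml.qnn.TorchLayer(qnode, weight_shapes)

# Hypothetical skip-gram-style model: a classical embedding feeds the
# quantum layer, whose output is projected back to vocabulary logits
class QuantumSkipGram(torch.nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, n_qubits)
        self.qlayer = qlayer
        self.out = torch.nn.Linear(n_qubits, vocab_size)

    def forward(self, center_word_ids):
        x = self.embed(center_word_ids)  # classical lookup -> 4 features
        x = self.qlayer(x)               # quantum transformation
        return self.out(x)               # scores for context words

model = QuantumSkipGram(vocab_size=100)
logits = model(torch.tensor([3, 17]))
print(logits.shape)  # torch.Size([2, 100])
```

Whether the quantum layer adds anything over a purely classical embedding in such a setup is exactly the kind of open question I'd love to see someone explore.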