pip
In simulation mode, connect_server simply creates an insecure channel with the server. This is meant as a quick way to test without requiring specific hardware, so do not use simulation mode in production.

For the upload_model method, we need to specify the ONNX file, the shape of the inputs, and the type of data. Because we run a BERT model here, the inputs are integers that represent the tokens sent to the model.
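
To make the "shape of the inputs" and "type of data" more concrete, here is a minimal sketch of how a sentence becomes a fixed-length list of integers. It does not use the real BERT tokenizer or any client API: the vocabulary `TOY_VOCAB`, the helper `encode`, and the sequence length of 8 are all hypothetical, chosen only for illustration.

```python
# Hypothetical toy vocabulary, for illustration only. A real BERT tokenizer
# uses a much larger WordPiece vocabulary with different IDs.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "hello": 4, "world": 5}


def encode(text, max_length=8):
    """Map whitespace-separated words to integer IDs, padded to max_length."""
    # Keep at most max_length - 2 words to leave room for [CLS] and [SEP].
    words = text.lower().split()[: max_length - 2]
    ids = [TOY_VOCAB["[CLS]"]]
    ids += [TOY_VOCAB.get(word, TOY_VOCAB["[UNK]"]) for word in words]
    ids.append(TOY_VOCAB["[SEP]"])
    # Pad so that every input has the same length.
    ids += [TOY_VOCAB["[PAD]"]] * (max_length - len(ids))
    return ids


batch = [encode("Hello world")]            # a batch containing one sentence
input_shape = (len(batch), len(batch[0]))  # (batch size, sequence length)

print(batch[0])
print(input_shape)
```

The model therefore receives a two-dimensional array of integers rather than floating-point values, and the shape declared at upload time has to match the padded length that the tokenizer produces.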