I want to gather the following common statistics from ONNX models:
+The number of occurrences of each layer/operator type,
+The kernel size, stride, and input/output tensor sizes of Conv operators,
+The total size of the weight data, which is read-only for inference workloads,
+The data produced and consumed at runtime (intermediate tensors),
+The connectivity within the neural network graph.
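A minimal sketch of how most of these statistics could be read from a model graph. It assumes the object passed in looks like the `graph` field of a model loaded with the `onnx` Python package (for example `onnx.load("model.onnx").graph`); the function name `graph_stats` and the returned dictionary layout are illustrative choices, not part of any library. Input/output tensor sizes of Conv nodes are not covered here, since they generally require running shape inference first.

```python
from collections import Counter
from math import prod

# Bytes per element for a few common ONNX TensorProto.data_type codes
# (1=FLOAT, 2=UINT8, 3=INT8, 6=INT32, 7=INT64, 10=FLOAT16, 11=DOUBLE).
# Assumption: extend this table if your models use other element types;
# unknown types are counted as 0 bytes below.
ELEM_BYTES = {1: 4, 2: 1, 3: 1, 6: 4, 7: 8, 10: 2, 11: 8}


def graph_stats(graph):
    """Collect static statistics from an ONNX GraphProto-like object."""
    # Number of nodes per operator type.
    op_counts = Counter(node.op_type for node in graph.node)

    # Weights are stored as graph initializers:
    # size = number of elements * bytes per element.
    weight_bytes = sum(
        prod(init.dims) * ELEM_BYTES.get(init.data_type, 0)
        for init in graph.initializer
    )

    # Kernel size and stride are integer-list attributes of each Conv node.
    # A missing attribute is reported as None (the operator's default applies).
    convs = []
    for node in graph.node:
        if node.op_type == "Conv":
            attrs = {a.name: list(a.ints) for a in node.attribute}
            convs.append({
                "name": node.name,
                "kernel_shape": attrs.get("kernel_shape"),
                "strides": attrs.get("strides"),
            })

    # Connectivity: how many node inputs consume each tensor name (fan-out).
    fan_out = Counter(
        name for node in graph.node for name in node.input if name
    )

    return {
        "op_counts": op_counts,
        "weight_bytes": weight_bytes,
        "convs": convs,
        "fan_out": fan_out,
    }
```

Fan-out per tensor is only one possible measure of connectivity; counting skip connections or the longest path through the graph would be equally reasonable alternatives.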
The deliverable should be a CLI program that takes a folder of ONNX models and produces one or more comma-separated values (CSV) files.
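One possible shape for such a CLI, sketched with the Python standard library only. The names `write_stats_csv` and `main`, the `-o/--output` option, and the default `stats.csv` file name are assumptions made for illustration. The `collect` callback stands in for whatever function extracts statistics from a single model; the default used here only records the file size so that the sketch runs without any ONNX dependency.

```python
import argparse
import csv
from pathlib import Path


def write_stats_csv(model_dir, out_path, collect):
    """Write one CSV row per *.onnx file found directly in model_dir.

    `collect` is a caller-supplied (hypothetical) function that maps a
    model path to a flat dict of statistics.
    """
    rows = [
        {"model": p.name, **collect(p)}
        for p in sorted(Path(model_dir).glob("*.onnx"))
    ]
    # Use the union of all keys so that models with different operator
    # sets still share a single header; missing values are written as 0.
    fields = ["model"] + sorted({k for r in rows for k in r} - {"model"})
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields, restval=0)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def main(argv=None, collect=lambda path: {"file_bytes": path.stat().st_size}):
    parser = argparse.ArgumentParser(
        description="Dump ONNX model statistics to a CSV file."
    )
    parser.add_argument("model_dir", help="folder containing .onnx files")
    parser.add_argument("-o", "--output", default="stats.csv",
                        help="CSV file to write")
    args = parser.parse_args(argv)
    n = write_stats_csv(args.model_dir, args.output, collect)
    print(f"wrote {n} rows to {args.output}")
```

To use this as a script, call `main()` under an `if __name__ == "__main__":` guard and pass a real statistics function as `collect`.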
The outcome should be multiple graphs that show the empirical relationship between these statistics and the runtime performance of multiple models.
Try to make timing measurements as accurate as possible, meaning:
+Do not run in a virtual machine unless you are using some kind of simulator,
+Limit the number of background applications (close browsers, video playback, etc.),
+If you are using C/C++, avoid malloc/free (or new/delete) inside the measured region as much as possible,
+Similarly, do not open/close or read/write files inside the measured region.
If there is too much noise in your system, measure multiple times and report statistics (for example mean, median, and standard deviation) rather than single absolute values.
To reduce the effects of cold caches, run your application N times in succession and discard the first 5-10 measurements (this is sometimes called cache warming).
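The measurement advice above can be sketched as a small timing harness. The function name `benchmark` and its defaults (50 timed runs, 10 warm-up runs) are illustrative assumptions. `fn` would wrap whatever inference call you use, with the model loaded and the inputs prepared beforehand so that no file I/O happens inside the timed region.

```python
import statistics
import time


def benchmark(fn, runs=50, warmup=10):
    """Time fn() `runs` times after `warmup` discarded calls (seconds)."""
    for _ in range(warmup):
        fn()  # warm caches; these calls are not measured

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        samples.append(elapsed)  # bookkeeping happens outside the timed span

    # Report a distribution rather than a single absolute number.
    return {
        "runs": runs,
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "stdev": statistics.stdev(samples) if runs > 1 else 0.0,
    }
```

The median and minimum are usually less sensitive to background noise than the mean, so reporting them alongside the standard deviation gives a clearer picture of a noisy system.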