Performance History
Overview of test results for the most recent commits:
Performance for matmul_512_512_4096_bf16_f32_O2_npu1_4col_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_512_4096_bf16_f32_O2_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_512_4096_bf16_f32_O3_npu1_4col_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_512_4096_bf16_f32_O3_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_512_4096_bf16_f32_O3_npu1_4col_outline_ukernel_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_ukernel_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_transpose_b_512_4096_512_bf16_f32_O3_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_ukernel_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_transpose_a_4096_512_512_bf16_f32_O3_npu1_4col_outline_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_empty_benchmark
Total ops: 0
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_4_level_tiling_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_4_level_tiling_ukernel_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul4d_512_4096_512_bf16_f32_O3_npu1_4col_outline_4_level_tiling_ukernel_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_empty_4_level_tiling_benchmark
Total ops: 0
Number of cores: 16
Performance for matmul_512_512_512_bf16_f32_O2_npu1_4col_callrepl_100_outline_benchmark
Total ops: 26843545600
Number of cores: 1
Performance for matmul_512_512_512_bf16_f32_O3_npu1_4col_callrepl_100_outline_benchmark
Total ops: 26843545600
Number of cores: 1
Performance for matmul_const_bias_ctrlpkt_1024_1024_1024_i8_i32_benchmark_bias_2
Total ops: 26843545600
Number of cores: 1
Performance for matmul_512_512_4096_bf16_f32_O3_npu1_4col_outline_packet_flow_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_4096_512_bf16_f32_O3_npu1_4col_outline_packet_flow_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_packet_flow_benchmark
Total ops: 2147483648
Number of cores: 16
Performance for matmul_512_512_512_bf16_f32_O3_npu1_4col_callrepl_100_outline_ukernel_benchmark
Total ops: 26843545600
Number of cores: 1
Performance for matmul_1024_1024_1024_i8_i32_npu1_4col_reconfigure_only_benchmark
Performance for matmul_1024_1024_1024_i8_i32_npu1_4col_pdi_load_only_benchmark
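The "Total ops" figures listed above appear to follow the usual matmul count of 2 * M * N * K (one multiply and one add per inner-product term), multiplied by the call replication count for the callrepl_100 entries and set to 0 for the "empty" variants. The following is a minimal sketch of that arithmetic, not the tool that generated this page. The helper name total_ops_from_name and the way it parses benchmark names are assumptions made for illustration. The reported value for matmul_const_bias_ctrlpkt_1024_1024_1024_i8_i32_benchmark_bias_2 does not match this formula and is left as reported.

```python
import re


def total_ops_from_name(name: str) -> int:
    """Estimate total ops from a benchmark name (hypothetical helper).

    Assumes the first run of three underscore-separated integers in the
    name holds the matmul dimensions, and that ops = 2 * M * N * K.
    """
    # "empty" variants do no arithmetic, so they are counted as 0 ops.
    if "_empty_" in name:
        return 0

    dims = re.search(r"_(\d+)_(\d+)_(\d+)_", name)
    if dims is None:
        raise ValueError(f"no matmul dimensions found in {name!r}")
    d0, d1, d2 = (int(d) for d in dims.groups())

    # callrepl_<n> repeats the same matmul n times.
    repl = re.search(r"callrepl_(\d+)", name)
    calls = int(repl.group(1)) if repl else 1

    return 2 * d0 * d1 * d2 * calls


if __name__ == "__main__":
    for bench in (
        "matmul_512_512_4096_bf16_f32_O2_npu1_4col_benchmark",
        "matmul_512_512_512_bf16_f32_O3_npu1_4col_callrepl_100_outline_benchmark",
        "matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_empty_benchmark",
    ):
        print(bench, total_ops_from_name(bench))
```

Because the product 2 * M * N * K does not depend on the order of the three dimensions, the sketch does not need to know which position in the name is M, N, or K.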