Codes
Medusa. An easy-to-use framework that accelerates LLM generation with multiple lightweight decoding heads.
Key techniques: LLM, speculative decoding.

MaxK-GNN. Implementation of "MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training."
Key techniques: graph neural networks, GPU kernel design, MaxK nonlinearity.

LoT. Implementation of "Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate."
Key techniques: generalization, regularization.

LinGCN. Implementation of "LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference."
Key techniques: privacy-preserving machine learning, efficient private inference, machine learning as a service, homomorphic encryption, non-linear pruning, ST-GCN.

AutoReP. Implementation of the AutoReP ReLU replacement algorithm on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet.
Key techniques: L0-based replacement algorithm, polynomial activation functions, private inference.

Accel-GCN. Implementation of SpMM for GCNs on GPU platforms.
Key techniques: degree sorting, block-level partitioning, combined warp.

PASNet (non-FPGA). Implementation of the PASNet algorithm (without FPGA acceleration), with evaluation code for PASNet-A (ResNet-18 backbone), PASNet-B and PASNet-C (ResNet-50 backbone), and PASNet-D (MobileNetV2 backbone) on ImageNet.
Key techniques: neural architecture search, two-party computation, private inference.

SparseGNN. Implementation of different sparsification algorithms for GNNs, such as sparse training, ADMM, and SLR.
Key techniques: sparse training, ADMM, SLR, GNN training.
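The Medusa entry above relies on speculative decoding: cheap extra heads draft several future tokens at once, and the base model verifies the draft, keeping the longest accepted prefix. A minimal toy sketch of that accept/verify loop follows; the "model" and "heads" here are hypothetical stand-ins (deterministic toy functions, not the real Medusa heads, which are small extra layers trained on top of the LLM).

```python
def base_model_next(prefix):
    """Stand-in base LM: deterministically maps a token prefix to the next token."""
    return (sum(prefix) * 31 + len(prefix)) % 100

def medusa_heads(prefix, k=3):
    """Stand-in draft heads: cheaply guess the next k tokens in one shot.
    In this toy the guesses happen to match the base model, so drafts are accepted;
    real heads are approximate, and mismatches are caught during verification."""
    guesses, p = [], list(prefix)
    for _ in range(k):
        t = base_model_next(p)
        guesses.append(t)
        p.append(t)
    return guesses

def generate(prefix, n_tokens, k=3):
    out = list(prefix)
    while len(out) - len(prefix) < n_tokens:
        draft = medusa_heads(out, k)
        # Verification: accept drafted tokens while the base model agrees.
        accepted = 0
        for t in draft:
            if base_model_next(out) == t:
                out.append(t)
                accepted += 1
            else:
                break
        if accepted == 0:
            out.append(base_model_next(out))  # fall back to normal decoding
    return out[len(prefix):len(prefix) + n_tokens]
```

Because verification accepts multiple tokens per base-model step, the output matches plain autoregressive decoding while taking fewer sequential steps.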
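The core computation Accel-GCN accelerates is SpMM: multiplying a sparse adjacency matrix (CSR format) by a dense feature matrix. A reference-semantics sketch is below; the actual repository implements this as a CUDA kernel with degree sorting, block-level partitioning, and combined warps, none of which appear in this plain-Python version.

```python
def spmm_csr(indptr, indices, data, X):
    """Reference SpMM: Y = A @ X, with A given in CSR form.

    indptr/indices/data: standard CSR arrays of the sparse matrix A.
    X: dense feature matrix as a list of rows.
    """
    n_rows = len(indptr) - 1
    n_cols = len(X[0])
    Y = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):                  # on GPU: one row per warp/block
        for p in range(indptr[i], indptr[i + 1]):
            j, a = indices[p], data[p]
            for c in range(n_cols):          # dense columns: coalesced accesses
                Y[i][c] += a * X[j][c]
    return Y
```

Degree sorting and block-level partitioning matter because GCN adjacency rows have wildly varying numbers of nonzeros; grouping similar-degree rows keeps GPU warps evenly loaded.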
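The idea behind AutoReP's ReLU replacement is that exact ReLUs are expensive under cryptographic private inference, so selected ReLUs are swapped for low-degree polynomials that are cheap to evaluate on encrypted or secret-shared data. The sketch below only illustrates that swap with a binary keep/replace mask and made-up polynomial coefficients; the real algorithm learns the mask with an L0-style relaxation and trains the coefficients.

```python
def relu(x):
    return max(0.0, x)

def poly_act(x, a=0.125, b=0.5, c=0.25):
    # Degree-2 polynomial as a smooth ReLU surrogate.
    # Coefficients are illustrative, not AutoReP's trained values.
    return a * x * x + b * x + c

def mixed_layer(xs, keep_relu_mask):
    """Apply exact ReLU where the mask is 1, the polynomial surrogate where 0."""
    return [relu(x) if m else poly_act(x) for x, m in zip(xs, keep_relu_mask)]
```

The accuracy/cost trade-off then reduces to choosing which mask entries to set to 0, which is what the replacement algorithm automates.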