
SHIQI WANG

I am a research scientist at Meta, working on coding and reasoning capabilities for Llama. Previously, I was a research scientist at AWS AI Lab and one of the founding members of Amazon CodeWhisperer (now Amazon Q inline). There I led multiple scientific initiatives for Code LLMs, including finetuning, robustness, benchmarking, and bug fixing, that fed into both production and publications. I obtained my Ph.D. from the Department of Computer Science at Columbia University, advised by Professor Suman Jana. Before coming to Columbia, I received my B.Eng. from Shanghai Jiao Tong University in 2017.

Mail: tcwangshiqi (AT) meta.com; wshiqi (AT) amazon.com; tcwangshiqi (AT) gmail.com; tcwangshiqi (AT) cs.columbia.edu

↳ Google Scholar    ↳ Github     ↳ LinkedIn     ↳ DBLP

WORK EXPERIENCE

Meta - Llama Team (NYC, US, full-time, Oct 2024 - present).

AWS AI Lab - CodeWhisperer (NYC, US, full-time, May 2022 - Oct 2024).

Applied Science Group (ASG) - Microsoft Research (Redmond, US, 2021 summer), working with Hamid Vaezi Joze and Wedward Wei.

Cyber Security Intelligence (CSI) Team - IBM Research (Yorktown Heights, US, 2020 summer), working with Kevin Eykholt, Taesung Lee, Jiyong Jang, and Ian Molloy.

Baidu Security X-Lab (Sunnyvale, US, 2019 summer), working with Yunhan Jia, Zhenyu Zhong, and Yantao Lu.




PUBLICATIONS

"Training LLMs to Better Self-Debug and Explain Code"
Nan Jiang, Xiaopeng Li, Shiqi Wang, Qiang Zhou, Soneya Binta Hossain, Baishakhi Ray, Varun Kumar, Xiaofei Ma, Anoop Deoras
38th Conference on Neural Information Processing Systems (NeurIPS 2024)

"CodeFort: Robust Training for Code Generation Models"
Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras
Empirical Methods in Natural Language Processing (EMNLP Findings 2024)

"Reasoning and Planning with Large Language Models in Code Development"
Hao Ding, Ziwei Fan, Ingo Guhring, Gaurav Gupta, Wooseok Ha, Luke Huan, Linbo Liu, Behrooz Omidvar-Tehrani, Shiqi Wang, Hao Zhou (authors arranged alphabetically)
Survey Paper & Lecture-style Tutorial at KDD 2024

"Token Alignment via Character Matching for Subword Completion"
Ben Athiwaratkun, Shiqi Wang, Mingyue Shang, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Rob Kwiatowski, Ramesh Nallapati, Bing Xiang
62nd Annual Meeting of the Association for Computational Linguistics (ACL Findings 2024)

"Shifting Attention to Relevance: Towards the Uncertainty Estimation of Large Language Models"
Jinhao Duan, Hao Cheng, Shiqi Wang, Chenan Wang, Alex Zavalny, Renjing Xu, Bhavya Kailkhura, Kaidi Xu
62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

"Code-Aware Prompting: A study of Coverage Guided Test Generation in Regression Setting using LLM"
Gabriel Ryan, Siddhartha Jain, Mingyue Shang, Shiqi Wang, Xiaofei Ma, Murali Krishna Ramanathan, Baishakhi Ray
Symposium on the Foundations of Software Engineering (FSE 2024)

"ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language Models"
Jinhao Duan, Shiqi Wang, James Diffenderfer, Lichao Sun, Tianlong Chen, Bhavya Kailkhura, Kaidi Xu
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)

"ReCode: Robustness Evaluation of Code Generation Models"
Shiqi Wang*, Zheng Li*, Haifeng Qian, Mingyue Shang, Chenghao Yang, Zijian Wang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang (* indicates equal contribution)
61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
ACL 2023 best paper recommendation in meta review
also accepted to ICLR 2023 Deep Learning for Code (DL4C) workshop
Science Blog post regarding ReCode

"Greener yet Powerful: Taming Large Code Generation Models with Quantization"
Xiaokai Wei, Sujan Gonugondla, Shiqi Wang, Wasi Ahmad, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, Qing Sun, Ben Athiwaratkun, Mingyue Shang, Murali Krishna Ramanathan, Parminder Bhatia, Bing Xiang
Symposium on the Foundations of Software Engineering (FSE 2023)

"Are diffusion models vulnerable to membership inference attacks?"
Jinhao Duan, Fei Kong, Shiqi Wang, Xiaoshuang Shi, Kaidi Xu
40th International Conference on Machine Learning (ICML 2023)

"Multi-lingual Evaluation of Code Generation Models"
Ben Athiwaratkun, Sanjay Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang
11th International Conference on Learning Representations (ICLR 2023)

"General Cutting Planes for Bound-Propagation-Based Neural Network Verification"
Huan Zhang*, Shiqi Wang*, Kaidi Xu*, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, Zico Kolter (* indicates equal contribution)
36th Conference on Neural Information Processing Systems (NeurIPS 2022)[GCP-CROWN code]

"A Branch and Bound Framework for Stronger Adversarial Attacks of ReLU Networks"
Huan Zhang*, Shiqi Wang*, Kaidi Xu, Yihan Wang, Suman Jana, Cho-Jui Hsieh, Zico Kolter (* indicates equal contribution)
39th International Conference on Machine Learning (ICML 2022)

"Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Complete and Incomplete Neural Network Verification"
Shiqi Wang*, Huan Zhang*, Kaidi Xu*, Xue Lin, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter (* indicates equal contribution)
ICML 2021 Workshop AML
35th Conference on Neural Information Processing Systems (NeurIPS 2021)[beta-CROWN code]
Global winner of the 2nd International Verification of Neural Networks Competition (VNN-COMP 2021) [alpha-beta-CROWN code]

"Learning Security Classifiers with Verified Global Robustness Properties"
Yizheng Chen, Shiqi Wang, Yue Qin, Xiaojing Liao, Suman Jana, and David Wagner
28th ACM Conference on Computer and Communications Security (CCS 2021) [code]
Best Paper Award Runner-Up

"Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers"
Kaidi Xu*, Huan Zhang*, Shiqi Wang, Yihan Wang, Suman Jana, Xue Lin, Cho-Jui Hsieh (* indicates equal contribution)
9th International Conference on Learning Representations (ICLR 2021) [poster]

"Adaptive Verifiable Training Using Pairwise Class Similarity"
Shiqi Wang, Kevin Eykholt, Taesung Lee, Jiyong Jang, Ian Molloy
35th AAAI Conference on Artificial Intelligence (AAAI 2021, acceptance rate: 21%) [poster]

"Cost-Aware Robust Tree Ensembles for Security Applications"
Yizheng Chen, Shiqi Wang, Weifan Jiang, Asaf Cidon, Suman Jana
30th USENIX Security Symposium (USENIX Security 2021).

"On Pruning Adversarially Robust Neural Networks"
Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana
34th Conference on Neural Information Processing Systems (NeurIPS 2020, acceptance rate: 20.1%)[HYDRA code][poster]
↳ also appears in ICLR 2020 workshop on Trustworthy ML

"On Training Robust PDF Malware Classifiers"
Yizheng Chen, Shiqi Wang, Dongdong She, Suman Jana
29th USENIX Security Symposium (USENIX Security 2020).

"Enhancing Gradient-based Attacks with Symbolic Intervals"
Shiqi Wang, Yizheng Chen, Ahmed Abdou, Suman Jana
ICML 2019 Workshop on Security and Privacy of Machine Learning (contributed talk in SPML 2019, acceptance rate: 4/68 = 5.9%) [Interval attacks code] [poster]
↳ Interval attacks appear on MadryLab MNIST Challenge Leaderboard.

"Efficient Formal Safety Analysis of Neural Networks"
Shiqi Wang, Kexin Pei, Justin Whitehouse, Junfeng Yang, Suman Jana
32nd Conference on Neural Information Processing Systems (NeurIPS 2018, acceptance rate: 20.8%) [Neurify code] [poster] [video].

"Formal Security Analysis of Neural Networks using Symbolic Intervals"
Shiqi Wang, Kexin Pei, Justin Whitehouse, Junfeng Yang, Suman Jana
27th USENIX Security Symposium (USENIX Security 2018, acceptance rate: 19%)[ReluVal code][video].

"ContexIoT: Towards Providing Contextual Integrity to Appified IoT Platforms"
Yunhan Jack Jia, Qi Alfred Chen, Shiqi Wang, Amir Rahmati, Earlence Fernandes, Z. Morley Mao, Atul Prakash
Proceedings of the 24th Network and Distributed System Security Symposium (NDSS 2017, acceptance rate: 16%). (Undergraduate Research)

"Defense against impersonating attackers: An efficient RFID mutual authentication protocol based on standard"
Shiqi Wang, Linsen Li, Gaosheng Chen, Tao Chen, Zeming Wang
8th International Conference on Information and Communication Systems (ICICS 2017). (Undergraduate Research)

"Improved Group Management Protocol of RFID password Method"
Tao Chen, Linsen Li, Shiqi Wang, Gaosheng Chen, Zeming Wang
International Conference on Internet of Things, Data and Cloud Computing (ICC 2017). (Undergraduate Research)




TOOLS

I am one of the main contributors to alpha-beta-CROWN, the neural network verifier that was the global winner of VNN-COMP 2021-2023.
International Verification of Neural Networks Competition VNN-COMP
[alpha-beta-CROWN code] [competition report] [certificate] [alpha-beta-CROWN logo]

I proposed and built ReluVal and Neurify, two of the first complete neural network verifiers in the field.
[ReluVal] [Neurify]




PREPRINTS

"Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning"
Yifeng Ding, Hantian Ding, Shiqi Wang, Qing Sun, Varun Kumar, Zijian Wang

"Towards Understanding Fast Adversarial Training"
Bai Li, Shiqi Wang, Suman Jana, Lawrence Carin

"MixTrain: Scalable Training of Verifiably Robust Neural Networks"
Shiqi Wang, Yizheng Chen, Ahmed Abdou, Suman Jana
Our symbolic interval analysis library has been incorporated into Perceptron, an adversarial toolbox for benchmarking various safety and security properties of deep neural networks at Baidu Security X-Lab.

"Towards Practical Lottery Ticket Hypothesis for Adversarial Training"
Bai Li*, Shiqi Wang*, Yunhan Jia, Yantao Lu, Zhenyu Zhong, Lawrence Carin, Suman Jana (* indicates equal contribution)

"Towards Compact and Robust Deep Neural Networks"
Vikash Sehwag*, Shiqi Wang*, Prateek Mittal, Suman Jana (* indicates equal contribution)




PATENTS

"Book management method based on color rectangular code and color rectangular code label" (CN106919966A).
Linsen Li, Shiqi Wang, Junhua Tang, Yue Wu, Jianhua Li (Undergraduate Research)




PH.D. THESIS

"Efficient Neural Network Verification Using Branch and Bound".




ACADEMIC SERVICE

Conference PC/Reviewer: NeurIPS (2020, 2021, 2022, 2023, 2024), ICML (2021, 2022, 2023, 2024), ICLR (2022, 2023, 2024), ARR (2024), COLM (2024), AAAI (2022, 2023), AISec (2021, 2022, 2023, 2024), WFVML 2022, MAPS 2023.
Journal Reviewer: TMLR; TNNLS; SAS; Algorithms; Entropy.
KDD 2024 Tutorial Organizer: Reasoning and Planning with Large Language Models in Code Development [tutorial webpage]
AAAI 2022 Tutorial Organizer: Formal Verification of Deep Neural Networks: Theory and Practice [tutorial webpage]
Workshop Organizer: ATVA 2021 Workshop on Security and Reliability of Machine Learning (SRML).




MORE ABOUT ME

I am a national second-level basketball player.
I have also been a Physique athlete since 2022.
If you love Shiba Inus, Nikki's Instagram is definitely worth a look.
If you love flower design, check my wife's page @wildpetals_ny!




LINKS

↳ Google Scholar    ↳ Github    ↳ DBLP     ↳ Facebook    ↳ LinkedIn     ↳ Instagram