B.S. Applied Statistics student at Anhui University, expected graduation 2027.
Data Analysis / Operations
Applied statistics student focused on data workflows, AI annotation platforms, and research operations.
Strong foundation in data engineering and AI toolchain development, with experience in Python, SQL, data cleaning, structured processing, million-scale database construction, Label Studio private deployment, Chinese localization, multimodal annotation, and model output evaluation.
About
Data, platform, and research collaboration experience
Experience supporting research-oriented data workflows, including collection, cleaning, filtering, analysis, annotation, evaluation, and technical documentation.
Participated in large-model-related research, private deployment of an AI annotation platform, and a summer research program at the National University of Singapore.
Education
Applied statistics and analysis foundation
Anhui University
B.S. in Applied Statistics
2023 - 2027
Relevant coursework
- Achieved strong academic performance in Java Programming, Data Structures, and mathematics-related courses.
- Completed interdisciplinary training in artificial intelligence, data science, machine learning, and deep learning applications.
- Participated in a summer research program at the National University of Singapore.
- Published one ICCGV (EI-indexed) conference paper as co-first author.
Experience
Research and enterprise data internships
Work spanning large-model data, annotation platforms, quality assessment, and technical documentation.
2025.10 - 2026.04
Research Assistant
Institute of Information Engineering, Chinese Academy of Sciences
- Supported end-to-end data workflows for large-model-related research tasks, including data collection, cleaning, filtering, processing, and analysis.
- Participated in private deployment and Chinese localization of the Label Studio annotation platform for secure intranet research environments.
- Contributed to multimodal data annotation and model output evaluation.
- Prepared deployment guides, operation manuals, and other technical documentation.
2024.07 - 2024.08
Data Analysis Intern
iFlytek Co., Ltd.
- Provided data analysis and quality assessment support for large-model-related tasks.
- Analyzed, filtered, cleaned, and organized model-generated data for training and evaluation workflows.
- Assisted with data quality checks and usability improvements; supported workflows with an accuracy level of approximately 80%.
- Participated in procurement data management and maintained the related records.
2023.12 - 2024.02
Data Processing Intern
iFlytek Co., Ltd.
- Handled data sorting, cleaning, classification, and formatting for large-model-related tasks.
- Performed data screening, deduplication, and structured processing to support model training and data analysis.
- Assisted in preliminary checking of model outputs and documented data-related issues.
- Supported internal data documentation and records management.
Projects
Data engineering, annotation platform, and model evaluation projects
Data Engineer
Million-Scale Medical Database End-to-End Construction
- Contributed to a high-quality medical database for precision medicine research, covering data acquisition, cleaning, de-identification, structuring, version management, and quality control.
- Integrated heterogeneous data from electronic medical records, laboratory reports, and other sources.
- Established standardized field-mapping rules and quality control indicators for over one million sensitive medical records.
- Introduced automated validation mechanisms and missing-value handling strategies, achieving a data completeness rate of over 96% and annotation consistency above 94%.
Platform Lead
Label Studio Private Deployment and Chinese Localization
- Led private deployment of Label Studio for a secure intranet research environment using Docker.
- Completed deep Chinese localization across interface text, interaction prompts, error messages, and backend management modules, with localization coverage above 98%.
- Conducted full-function testing, cross-browser compatibility verification, and Chinese display optimization.
- Prepared deployment and annotation operation documentation.
- Built a collaborative workflow combining manual annotation with multi-model pre-annotation; the methodology contributed to one SCI journal paper currently under review.
Researcher, National University of Singapore
Summer Research Program
- Participated in research focused on local deployment of large models, visual question answering, and model output evaluation.
- Deployed multiple large models, including Flamingo-3B, Flamingo-9B, Qwen-13B, and ChatGPT-MINI.
- Used deployed models for VQA tasks and developed scripts for multi-dimensional scoring and analysis of generated outputs.
- Collaborated with the research team to complete and publish a related paper.
Skills
Skill set for data analysis and operations roles
Programming & Data Processing
Database & Data Engineering
AI & Machine Learning Tools
Statistics & Analysis
Documentation & Collaboration
Resume
Download the public resume
Downloads are generated from public-facing content and do not link to the original source PDFs.
Public resume
Includes education, experience, projects, skills, and public email; omits phone number, age, and gender.
Contact
Reach out through the public email
Only contact details suitable for public publishing are shown.
Preferred city
Hefei