B.S. Applied Statistics student at Anhui University, expected 2027.
Data Analyst / Data Developer
Applied statistics undergraduate building data workflows, IT tool maps, and AI research platforms.
Hands-on experience in data engineering, data analysis, AI data workflows, IT tool documentation, medical database construction, and Docker-based private deployment and localization of open-source AI platforms such as Label Studio.

Preferred city
Hefei
About
Data, platform, and research collaboration experience
Experience supporting data engineering, data analysis, AI data workflows, IT tool mapping, documentation workflows, and research toolchain deployment.
Contributed to IT tool and documentation workflow mapping at Volkswagen (China) Technology Co., Ltd., research data workflows at the Chinese Academy of Sciences, large-model data support at iFLYTEK, and summer research at the National University of Singapore.
Uses English as a working language; proficiency is at CEFR C1 level, with an IELTS overall score of 7.0 (Listening 8.5, Reading 6.5, Writing 6.0, Speaking 6.0).
Education
Applied statistics and analysis foundation
Anhui University
B.S. in Applied Statistics
2023 - 2027
Relevant coursework
- Achieved strong performance in Java Programming, Data Structures, and mathematics-related courses.
- Completed university-partnered AI and data science programs covering machine learning, data mining, and deep learning applications.
- Participated in a summer research program at the National University of Singapore.
- Co-first author of one EI-indexed ICCGV conference paper.
Experience
Enterprise data development and research data experience
Work spanning IT tool mapping, documentation workflows, large-model data support, annotation platforms, and research data workflows.
May 2026 - Present
Data Development Intern
Volkswagen (China) Technology Co., Ltd.
- Used English as a working language for cross-department communication on IT tools, permissions, and project documentation.
- Mapped IT tools used across Powerhouse departments, including tool connections, access methods, and role-based permissions.
- Identified project documents and the programs used to create them, forming a prioritized overview of IT tools and documentation workflows.
- Analyzed gaps and redundancies across tools and documents, then proposed integration solutions for smoother workflows and fewer duplicated tools.
- Supported IT coordination and tool-access applications for platforms such as Polarion, Feishu, and CATIA; prepared user instructions for tool requests and role management.
Oct 2025 - Apr 2026
Research Assistant
Institute of Information Engineering, Chinese Academy of Sciences
- Supported end-to-end research data workflows, including data acquisition, cleaning, desensitization, structuring, quality control, and version management.
- Contributed to local deployment and Chinese localization of Label Studio for secure in-house data annotation and AI research workflows.
Jul 2024 - Aug 2024
Data Analysis Intern
iFLYTEK Co., Ltd.
- Provided full-process data support for large-model workflows, including data analysis, selection, cleaning, and reuse in model development.
- Managed procurement-related data and assisted with data organization for internal business and AI-related tasks.
Dec 2023 - Feb 2024
Data Processing Intern
iFLYTEK Co., Ltd.
- Processed and cleaned large-model-generated data, helping prepare usable datasets for subsequent analysis and model input.
- Assisted with data selection, cleaning, and procurement-related information management.
Projects
Data engineering, annotation platform, and model evaluation projects
Data Engineer
Million-Scale Medical Database End-to-End Construction
Dec 2025 - Present
- Built a high-quality medical database from scratch for precision-medicine research, covering data collection, cleaning, desensitization, structuring, version control, and quality assurance.
- Integrated heterogeneous data such as electronic medical records and laboratory-test records; designed standardized field-mapping rules and quality-control indicators for over one million sensitive medical records.
- Introduced automated validation and missing-value imputation strategies, raising data completeness to over 96% and annotation consistency to over 94%.
- Provided a reliable data foundation for clinical analysis and AI model training.
Platform Lead
Label Studio Private Deployment and Chinese Localization
Oct 2025 - Present
- Led the Docker-based private deployment of Label Studio in an internal research environment, providing a stable offline service and keeping data within the secure domain.
- Completed full-stack Chinese localization across front-end interfaces, prompts, error messages, and admin modules, achieving over 98% localization coverage.
- Conducted functional testing, multi-browser compatibility checks, and Chinese display optimization; fixed timezone, encoding, and font-rendering issues.
- Wrote deployment and annotation user guides, and built a workflow combining manual annotation with multi-model pre-annotation that supports one SCI paper currently under review.
Researcher, National University of Singapore
Summer Research Program
Summer 2025
- Deployed multiple large models locally, including Flamingo-3B, Flamingo-9B, Qwen-13B, and ChatGPT-Mini, and used them for visual question answering tasks.
- Developed scripts to evaluate generated VQA outputs across multiple dimensions.
- Contributed to a team paper based on the research results.
Skills
Skill set for data analysis and data development
Programming & Data Processing
Database & Data Engineering
AI & Machine Learning Tools
Statistics & Analysis
Workflow & Documentation
Resume
Download the public resume
Downloads are generated from public-facing content and do not link to the original source PDFs.
Public resume
Includes education, experience, projects, skills, and public email; omits phone number, age, and gender.
Contact
Reach out through the public email
Only contact details suitable for public publishing are shown.