Jianfei Xu

Data Analyst / Data Developer

Applied statistics undergraduate building data workflows, IT tool maps, and AI research platforms.

Hands-on experience in data engineering, data analysis, AI data workflows, IT tool documentation, medical database construction, and Docker-based private deployment and localization of open-source AI platforms such as Label Studio.


Preferred city

Hefei

About

Data, platform, and research collaboration experience

B.S. student in Applied Statistics at Anhui University, expected to graduate in 2027.

Experience supporting data engineering, data analysis, AI data workflows, IT tool mapping, documentation workflows, and research toolchain deployment.

Contributed to IT tool and documentation workflow mapping at Volkswagen (China) Technology Co., Ltd., research data workflows at the Chinese Academy of Sciences, large-model data support at iFLYTEK, and summer research at the National University of Singapore.

Uses English as a working language: CEFR C1 level, with an IELTS overall score of 7.0 (Listening 8.5, Reading 6.5, Writing 6.0, Speaking 6.0).

Education

Applied statistics and analysis foundation

Anhui University

B.S. in Applied Statistics

2023 - 2027

Relevant coursework

Time Series Analysis, Applied Stochastic Processes, Data Structures, Statistical Forecasting and Decision Making, Multivariate Statistical Analysis
  • Achieved strong performance in Java Programming, Data Structures, and mathematics-related courses.
  • Completed university-partnered AI and data science programs covering machine learning, data mining, and deep learning applications.
  • Participated in a summer research program at the National University of Singapore.
  • Co-first author of one EI-indexed ICCGV conference paper.

Experience

Enterprise data development and research data experience

Work spanning IT tool mapping, documentation workflows, large-model data support, annotation platforms, and research data workflows.

May 2026 - Present

Data Development Intern

Volkswagen (China) Technology Co., Ltd.

  • Used English as a working language for cross-department communication on IT tools, permissions, and project documentation.
  • Mapped IT tools used across Powerhouse departments, including tool connections, access methods, and role-based permissions.
  • Identified project documents and the programs used to create them, forming a prioritized overview of IT tools and documentation workflows.
  • Analyzed gaps and redundancies across tools and documents, then proposed integration solutions for smoother workflows and fewer duplicated tools.
  • Supported IT coordination and tool-access applications for platforms such as Polarion, Feishu, and CATIA; prepared user instructions for tool requests and role management.

Oct 2025 - Apr 2026

Research Assistant

Institute of Information Engineering, Chinese Academy of Sciences

  • Supported end-to-end research data workflows, including data acquisition, cleaning, desensitization, structuring, quality control, and version management.
  • Contributed to local deployment and Chinese localization of Label Studio for secure in-house data annotation and AI research workflows.

Jul 2024 - Aug 2024

Data Analysis Intern

iFLYTEK Co., Ltd.

  • Provided full-process data support for large-model workflows, including data analysis, selection, cleaning, and reuse in model development.
  • Managed procurement-related data and assisted with data organization for internal business and AI-related tasks.

Dec 2023 - Feb 2024

Data Processing Intern

iFLYTEK Co., Ltd.

  • Processed and cleaned large-model-generated data, helping prepare usable datasets for subsequent analysis and model input.
  • Assisted with data selection, cleaning, and procurement-related information management.

Projects

Data engineering, annotation platform, and model evaluation projects

Data Engineer

Million-Scale Medical Database End-to-End Construction

Dec 2025 - Present

  • Built a high-quality medical database from scratch for precision-medicine research, covering data collection, cleaning, desensitization, structuring, version control, and quality assurance.
  • Integrated heterogeneous data such as electronic medical records and laboratory-test records; designed standardized field-mapping rules and quality-control indicators for over one million sensitive medical records.
  • Introduced automated validation and missing-value imputation strategies, raising data completeness to 96%+ and annotation consistency to 94%+.
  • Provided a reliable data foundation for clinical analysis and AI model training.

Platform Lead

Label Studio Private Deployment and Chinese Localization

Oct 2025 - Present

  • Led private deployment of Label Studio in an internal research environment using Docker, enabling stable offline service deployment and keeping data within the secure domain.
  • Completed full-stack Chinese localization across front-end interfaces, prompts, error messages, and admin modules, achieving over 98% localization coverage.
  • Conducted functional testing, multi-browser compatibility checks, and Chinese display optimization; fixed timezone, encoding, and font-rendering issues.
  • Wrote deployment and annotation user guides, and built a workflow combining manual annotation with multi-model pre-annotation, related to one SCI paper under review.

Researcher, National University of Singapore

Summer Research Program

Summer 2025

  • Deployed multiple large models locally, including Flamingo-3B, Flamingo-9B, Qwen-13B, and ChatGPT-Mini, and used them for visual question answering tasks.
  • Developed scripts to evaluate generated VQA outputs across multiple dimensions.
  • Contributed to a team paper based on the research results.

Skills

Skill set for data analysis and data development

Programming & Data Processing

Python, SQL, Java (basic), Data cleaning, Structured data processing, Data quality validation, Missing-value handling, Field mapping

Database & Data Engineering

Million-scale data processing, Database construction, Data ingestion, Version management, Quality-control workflow design

AI & Machine Learning Tools

Label Studio, Docker, Machine-learning workflows, Large-model data support, VQA, VQA evaluation, Open-source platform deployment and localization

Statistics & Analysis

Time Series Analysis, Applied Stochastic Processes, Multivariate Statistical Analysis, Statistical Forecasting and Decision Making

Workflow & Documentation

IT tool mapping, Access and role management documentation, Technical documentation, Technical manuals, Annotation guides, Cross-browser testing

Resume

Download the public resume

Downloads are generated from public-facing content and do not link to the original source PDFs.

Public resume

Includes education, experience, projects, skills, and public email; omits phone number, age, and gender.

Contact

Reach out through the public email

Only contact details suitable for public publishing are shown.
