mayin0902/README.md

Hi, I'm Mayin 👋

I am currently pursuing my M.S. degree at Nanjing University, with research and project experience in computer vision, multimodal AIGC, LLM fine-tuning, and AI for Science.

My recent work focuses on building practical AI systems, including e-commerce image generation, community rule violation detection, UAV image regression, medical image classification, and molecular machine learning.


About Me

  • 🎓 M.S. student at Nanjing University, AI for Science direction
  • 🏅 Kaggle Expert with 2 Silver Medals and 1 Bronze Medal
  • 🔬 Research experience in equivariant graph neural networks and machine-learning potentials
  • 🤖 Project experience in LLM fine-tuning, computer vision, AIGC image generation, and multimodal systems
  • 🛠️ Interested in building reliable AI systems that combine algorithms, engineering, and real-world applications

Selected Highlights

  • Kaggle Expert: 2 Silver Medals, 1 Bronze Medal, global ranking Top 1.18%
  • Google Jigsaw Rules Classification: Kaggle Silver Medal, ranking 42 / 2445
  • CSIRO Image2Biomass Prediction: Kaggle Silver Medal, ranking 83 / 3803
  • Jittor Medical Image Classification: Finalist in the Jittor Algorithm Challenge
  • E-commerce AIGC Try-on System: built an end-to-end virtual try-on and product image generation pipeline
  • FiLM-Δ-PaiNN: first-author research on equivariant graph neural networks for molecular potential prediction

Selected Projects

E-commerce AIGC Try-on System

Built an end-to-end virtual try-on and product image generation prototype for e-commerce scenarios.

Tech Stack: FLUX.1 Fill, Gemini Image Generation, SigLIP, LoRA, ComfyUI, FastAPI, LangGraph

Main Contributions:

  • Designed a dual-route generation pipeline combining local controllable diffusion models and high-level image generation models
  • Used SigLIP-based visual condition injection to preserve garment texture and visual details
  • Applied LoRA-based identity customization to improve face consistency in generated product images
  • Built an automated workflow with FastAPI, ComfyUI, and LangGraph for generation routing, quality checking, and retry control
  • Explored replacing traditional product photo shoots with AI-assisted generation to shorten product image production time
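
The routing, quality-checking, and retry steps listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: `generate_with_retry`, `Attempt`, the threshold of 0.8, and the per-route attempt limit are all hypothetical names and values, and the route callables stand in for whatever ComfyUI or hosted-model calls the real workflow makes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Attempt:
    route: str     # name of the generation route that produced the image
    image: object  # opaque image handle (hypothetical; could be a path or array)
    score: float   # quality score returned by the checker


def generate_with_retry(
    routes: Sequence[tuple[str, Callable[[], object]]],
    check_quality: Callable[[object], float],
    threshold: float = 0.8,
    max_attempts_per_route: int = 2,
) -> Optional[Attempt]:
    """Try each route in order and return the first attempt that passes.

    If no attempt reaches the threshold, fall back to the best-scoring one.
    Returns None only when no routes are supplied.
    """
    best: Optional[Attempt] = None
    for name, generate in routes:
        for _ in range(max_attempts_per_route):
            image = generate()
            attempt = Attempt(name, image, check_quality(image))
            if attempt.score >= threshold:
                return attempt
            # Remember the best failed attempt as a fallback.
            if best is None or attempt.score > best.score:
                best = attempt
    return best
```

Ordering the routes from cheapest to most expensive means the costlier route only runs once the cheaper one has used up its retries.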

Google Jigsaw Rules Classification

Kaggle Silver Medal, ranking 42 / 2445.

This project focused on community rule violation detection, where the model needed to judge whether a post violated a specific community rule.

Tech Stack: Qwen3-4B, QLoRA, DeepSpeed ZeRO-2, vLLM, GTE, Triplet Loss, DeBERTa-v3, FGM, EMA, Rank Blending

Main Contributions:

  • Reformulated the task as rule-conditioned violation detection instead of simple rule ID classification
  • Fine-tuned Qwen3-4B with QLoRA under limited GPU resources
  • Used DeepSpeed ZeRO-2 and bf16 mixed precision to improve training efficiency
  • Built a GTE-based metric learning branch to improve generalization to unseen rules
  • Built a DeBERTa-v3 discriminative classifier with FGM adversarial training and EMA
  • Combined LLM, metric-learning, and discriminative branches through rank-based blending
  • Improved final AUC to 0.929
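
As an illustration of the rank-based blending step, here is a small pure-Python sketch. The function names and the weights are assumptions for this example; the weights actually used in the solution are not stated here. Because AUC depends only on the ordering of scores, converting each branch's output to ranks first puts differently calibrated models on a common scale.

```python
from typing import Sequence


def rank_normalize(scores: Sequence[float]) -> list[float]:
    """Map scores to average ranks scaled to [0, 1]; tied scores share a rank."""
    n = len(scores)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied scores that starts at position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / (n - 1)
        i = j + 1
    return ranks


def rank_blend(
    model_scores: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> list[float]:
    """Weighted average of each model's rank-normalized scores."""
    total = sum(weights)
    ranked = [rank_normalize(s) for s in model_scores]
    n = len(model_scores[0])
    return [
        sum(w * r[i] for w, r in zip(weights, ranked)) / total
        for i in range(n)
    ]
```

For example, `rank_blend([llm_scores, metric_scores, deberta_scores], [0.5, 0.25, 0.25])` would blend three branches, where the three score lists and the weights are placeholders.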

CSIRO Image2Biomass Prediction

Kaggle Silver Medal, ranking 83 / 3803.

This project focused on multi-target biomass prediction from UAV top-view grassland images under a small-sample setting.

Tech Stack: PyTorch, timm, DINOv3 ViT-Huge, SigLIP, LightGBM, CatBoost, PCA, GMM, AMP

Main Contributions:

  • Designed a DINOv3-based dual-view image feature extraction pipeline for 2000×1000 UAV images
  • Split each image into left and right views and encoded them with shared-weight visual backbones
  • Developed a Gated Local Token Mixer to improve cross-view token interaction and local texture modeling
  • Used structured multi-head outputs to explicitly predict Green, Dead, and Clover biomass, then reconstruct GDM and Total biomass
  • Added a SigLIP + LightGBM / CatBoost semantic compensation branch
  • Improved weighted R² from baseline 0.54 to 0.63
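
The dual-view split and the structured output reconstruction can be sketched as follows. Two assumptions are made here and are not confirmed by the text above: that 2000×1000 means width × height, so each half is a 1000×1000 square, and that the derived targets follow GDM = Green + Clover and Total = GDM + Dead. The names `split_views`, `reconstruct`, and `Biomass` are hypothetical.

```python
from typing import NamedTuple

Box = tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


class Biomass(NamedTuple):
    green: float
    dead: float
    clover: float
    gdm: float
    total: float


def split_views(width: int, height: int) -> tuple[Box, Box]:
    """Return crop boxes for the left and right halves of an image."""
    mid = width // 2
    return (0, 0, mid, height), (mid, 0, width, height)


def reconstruct(green: float, dead: float, clover: float) -> Biomass:
    """Derive GDM and Total from the three directly predicted components.

    Assumption: GDM = Green + Clover and Total = GDM + Dead.
    """
    # Clamp to zero because a biomass value cannot be negative.
    green, dead, clover = (max(0.0, v) for v in (green, dead, clover))
    gdm = green + clover
    return Biomass(green, dead, clover, gdm, gdm + dead)
```

Predicting only the three components and deriving the other two keeps the five outputs mutually consistent by construction.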

Medical Image Fine-grained Classification

Finalist in the Jittor Algorithm Challenge, B leaderboard ranking 7th.

This project focused on fine-grained BI-RADS classification from breast ultrasound images.

Tech Stack: Jittor, EfficientNetV2-S, Multi-Dropout, BN-Neck, 5-Fold CV, TTA, AMP, Albumentations, OpenCV

Main Contributions:

  • Designed an EfficientNetV2-based fine-grained medical image classification model
  • Combined multi-scale features from intermediate and final backbone layers
  • Used BN-Neck normalization to stabilize feature representation
  • Built a Multi-Dropout classification head with multiple dropout rates and averaged logits
  • Used 5-fold cross validation, test-time augmentation, checkpoint ensemble, and mixed-precision training
  • Improved robustness under severe class imbalance, especially for high-risk categories with very limited samples
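
A framework-free sketch of the Multi-Dropout head idea: the same feature vector is passed through dropout at several rates, each masked copy goes through a linear classifier, and the logits are averaged. This is only an illustration in plain Python, not the Jittor implementation. The shared linear layer, the specific rates, and all function names are assumptions.

```python
import random
from typing import Optional, Sequence


def linear(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> list[float]:
    """Plain linear layer: one logit per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def dropout(features: Sequence[float], rate: float, rng: random.Random) -> list[float]:
    """Inverted dropout: zero each feature with probability `rate`, rescale the rest.

    `rate` must be strictly less than 1.
    """
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in features]


def multi_dropout_logits(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
    rates: Sequence[float] = (0.1, 0.2, 0.3, 0.4, 0.5),
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Average the logits computed from several differently masked feature copies."""
    if not training:
        # At inference time dropout is disabled, so a single pass is enough.
        return linear(features, weights, bias)
    rng = rng or random.Random()
    per_rate = [linear(dropout(features, r, rng), weights, bias) for r in rates]
    return [sum(col) / len(rates) for col in zip(*per_rate)]
```

Averaging over several masks gives the classifier several regularized views of each sample per step, which is commonly used to stabilize training on small datasets.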

FiLM-Δ-PaiNN: Equivariant GNN for Molecular Potential Prediction

First-author research project on molecular machine learning and AI for Science.

Tech Stack: PaiNN, FiLM, Δ-learning, PyTorch, ASE, DeepMD, equivariant GNNs

Main Contributions:

  • Designed an equivariant graph neural network for high-accuracy molecular potential prediction
  • Combined PaiNN-style equivariant message passing with FiLM-based physical information modulation
  • Used Δ-learning to reduce the learning difficulty between low-fidelity and high-fidelity energy labels
  • Evaluated the model on molecular and periodic benchmark datasets
  • Achieved significant error reduction compared with direct-learning baselines and several mainstream equivariant models
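
The two ideas named above, FiLM modulation and Δ-learning, reduce to very small formulas. The sketch below shows them on plain Python lists, assuming per-channel scalar features and per-structure energies. It is not the model's actual code, and the function names are hypothetical.

```python
from typing import Sequence


def film(
    hidden: Sequence[float],
    gamma: Sequence[float],
    beta: Sequence[float],
) -> list[float]:
    """Feature-wise linear modulation: scale and shift each channel."""
    return [g * h + b for h, g, b in zip(hidden, gamma, beta)]


def delta_targets(e_low: Sequence[float], e_high: Sequence[float]) -> list[float]:
    """Δ-learning training targets: the high-minus-low fidelity correction."""
    return [hi - lo for lo, hi in zip(e_low, e_high)]


def predict_high(e_low: Sequence[float], predicted_delta: Sequence[float]) -> list[float]:
    """Recover the high-fidelity estimate by adding the learned correction back."""
    return [lo + d for lo, d in zip(e_low, predicted_delta)]
```

In an equivariant network the shift term is normally applied only to invariant (scalar) channels, since adding a constant to vector features would break rotational equivariance, while scaling them by an invariant factor does not.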

Hydrogen Behavior at Mineral Interfaces

Research project on quantum-accuracy simulation of hydrogen behavior at mineral interfaces.

Tech Stack: VASP, DeepMD, DPA, LAMMPS, ASE, Python

Main Contributions:

  • Built a closed-loop workflow from DFT calculations to machine-learning potential training and molecular dynamics simulation
  • Generated and curated more than 40k atomic structures for model training
  • Trained machine-learning potentials for hydrogen-water-mineral interface systems
  • Used LAMMPS and DeepMD to study interfacial adsorption, diffusion, and structural stability
  • Improved simulation efficiency while maintaining near-DFT-level energy and force accuracy

Technical Skills

Programming and Engineering

  • Python, C/C++
  • PyTorch, Jittor, scikit-learn, timm
  • FastAPI, LangGraph, ComfyUI
  • OpenCV, Albumentations
  • Linux, Git, VS Code Remote SSH

Computer Vision and Multimodal Learning

  • DINOv3
  • SigLIP
  • EfficientNetV2
  • Vision Transformer
  • Image classification
  • Visual regression
  • Multi-scale feature fusion
  • Test-time augmentation and model ensemble

Large Language Models

  • Transformer, BERT, GPT
  • Qwen
  • QLoRA
  • DeepSpeed
  • vLLM
  • DeBERTa
  • GTE embedding models
  • Rank blending and ensemble learning

AIGC and Image Generation

  • Diffusion models
  • FLUX.1 Fill
  • Gemini image generation
  • LoRA fine-tuning
  • Virtual try-on
  • Image editing workflow design
  • Generation quality control and automatic retry

AI for Science

  • Molecular machine learning
  • Equivariant graph neural networks
  • PaiNN
  • DeepMD
  • DPA
  • Δ-learning
  • Molecular dynamics simulation
  • DFT-to-ML potential workflows

Current Interests

I am currently interested in:

  • Multimodal AIGC systems for e-commerce and content generation
  • LLM fine-tuning and efficient inference
  • Computer vision models for small-sample and fine-grained recognition
  • Reliable AI workflows with automatic evaluation and retry mechanisms
  • Equivariant graph neural networks and machine-learning potentials for molecular simulation

Contact

Pinned

  1. csiro-image2biomass-silver (Public)

    Kaggle Silver Medal solution for CSIRO Image2Biomass Prediction with DINOv3 dual-view regression and GBDT ensemble.

    Python 1

  2. google-jigsaw-agile-community-rules-silver (Public)

    Kaggle Silver Medal solution for Google Jigsaw Agile Community Rules Classification: Qwen3-4B + QLoRA, GTE, DeBERTa, Rank Blending.

    Python 1

  3. neurips-open-polymer-prediction-bronze (Public)

    Kaggle Bronze Medal solution for NeurIPS Open Polymer Prediction 2025 with molecular descriptors, GNN, CatBoost/XGBoost ensemble.

    Python 1

  4. 7th-jittor-cainiao-ultrasound-images (Public)

    National Rank 7 solution for Jittor AI Algorithm Challenge Track 1: breast ultrasound classification with EfficientNetV2-S and multi-dropout.

    Python 1

  5. composed-video-retrieval (Public)

    Forked from OmkarThawakar/composed-video-retrieval

    Composed Video Retrieval

    Python

  6. mayin0902 (Public)