Job Summary
We are seeking a Data Generation and Annotation Specialist to join our dynamic team within the Data Science department. The ideal candidate will be responsible for generating and annotating high-quality datasets used to train and evaluate machine learning models. The role requires meticulous attention to detail, an understanding of data ethics, and the ability to work with complex datasets in a variety of formats.
Responsibilities
- Generate synthetic data that closely mimics real-world scenarios for training machine learning models.
- Annotate and label large datasets with accurate tags for various features and classes to aid in model training and testing.
- Review and validate data annotations to ensure consistency and accuracy across the dataset.
- Collaborate with data scientists and machine learning engineers to understand data requirements and deliver datasets that meet specific model needs.
- Utilize annotation tools and software, adhering to project-specific guidelines and protocols.
- Participate in the development and refinement of annotation guidelines and quality control procedures.
- Identify and report any issues that may affect data quality or integrity.
- Stay up to date with the latest trends and technologies in data annotation and machine learning datasets.
Requirements
- Bachelor’s degree in Linguistics, Business Administration, Computer Science, or a related field.
- Proven experience in data annotation, data labelling, or a related role.
- Familiarity with machine learning concepts and the importance of high-quality data in AI.
- Strong attention to detail and a commitment to achieving high levels of data accuracy.
- Ability to work independently and collaboratively in a team environment.
- Excellent time management skills and the ability to handle multiple tasks simultaneously.
- Proficiency in using data annotation tools and platforms.
- Strong problem-solving skills and the ability to think critically.
- Effective communication skills, both written and verbal.
Nice-to-have
- Experience with scripting languages such as Python for automation of data processing tasks.
- Knowledge of data privacy laws and ethical considerations in data generation.
- Experience working with diverse datasets including text, images, and audio.
What we offer for your valuable work
- You’ll join a collaborative and motivated team where everyone, including you, is active in the product definition and development process.
- We provide quality collective insurance to all employees and their families, including dental care.
- We cover 50% of sports-related expenses (gym memberships, sportswear, etc.), up to $250 a year.
- We refund 50% of public transportation expenses, up to $50 a month.
- We offer a flexible hybrid schedule for employees who wish to work from home part of the time, after a training period (usually 3 months).
- We offer stock options, depending on performance, after a probation period.