Data Scientist - Vietnam
We are a digitally native company that helps organizations reinvent themselves and unleash their potential. We are the place where innovation, design, and engineering meet scale.
Globant is a 20-year-old NYSE-listed public organization with more than 27,000 employees working across 25 countries.
www.globant.com
Our expertise in Data & AI allows our studio to create a wide variety of end-to-end solutions for industries including finance, travel, media & entertainment, retail, and health, among others. We democratize data and foster organizational changes towards a data-driven culture.
Location: Vietnam (HN/DN)
Experience: 4+ years
YOU WILL GET THE CHANCE TO:
As a data scientist, you will be domain agnostic; hence, you'll be able to work in different domains like finance, pharmaceutical, media and entertainment, manufacturing, hospitality, and so on. You will gain exposure to the following:
- Get an opportunity to work on different projects and deal with problems like anomaly detection, time series forecasting, building bots, classification, regression, clustering, recommendations, A/B testing, etc.
- Contribute to innovative accelerators and develop tools, methodologies, and frameworks that can be used to accelerate the development of data science solutions.
- Build and maintain machine learning pipelines that support model development, training, deployment, and monitoring, ensuring that they meet the highest standards for quality and reliability.
- Gain hands-on experience with Google Cloud Platform (GCP), including exposure to BigQuery, Vertex AI, Kubeflow, and the Generative AI modules of GCP.
- Design and develop multiple POCs/POVs for existing customers or prospective leads. Preserve the knowledge gained through this research and innovation and use it to enhance the overall capability of the Artificial Intelligence Studio.
- Proactively interact with the client and make important technical decisions regarding design and architecture. Establish and maintain relationships with clients, acting as a trusted advisor and identifying opportunities for new or expanded business.
- Agree on scope, priorities, and deadlines with the project managers.
- Describe problems, provide solutions, and communicate clearly and accurately.
- Assure the overall technical quality of the solution.
- Estimate the time of development tasks and perform difficult/critical coding tasks.
- Define metrics and set objectives across multiple complex tasks.
WHAT WILL HELP YOU SUCCEED?
A strong base in statistics along with machine learning is mandatory, as is strong experience in machine learning, computer vision, and Gen AI techniques. You should be well-versed in techniques like object detection, face recognition, and object tracking, which can help you improve your models. Knowledge of the model life cycle, so that you can continuously improve a model after production deployment, is essential. You should stay current with the latest developments in the field of data science and advocate them to the other Data Scientists within the organization. We are looking for an expert professional with a lot of zeal to learn and explore new methodologies. You must work in a collaborative environment and come up with innovative ideas to continuously improve the solutions/models. Candidates must be willing to explore and research newer areas/technology/algorithms and look to continuously improve the models.
CORE TECHNICAL SKILLS
- Linear algebra, statistics, and strong Python skills.
- Exploratory Data Analysis and Data Visualization. This includes using statistical and visualization libraries like numpy, pandas, scipy, and matplotlib, or software like Power BI and Tableau. It is also good to have exposure to statistical analysis packages like SAS and SPSS.
- Must have expertise in hypothesis testing or A/B testing. Exposure to multi-armed bandit-based testing will be an added advantage.
- Supervised and Unsupervised ML using statistical and deep learning techniques. Candidates should be familiar with parametric and non-parametric machine learning methods. They should be able to define a sensible evaluation metric for the machine learning models. Exposure to automated machine learning is an added advantage.
- Exposure to advanced machine learning algorithms will be an added advantage. This can include reinforcement learning techniques, genetic algorithms, semi-supervised learning methods, etc.
- Candidates should be able to extract data from multiple data sources. This includes, but is not limited to, SQL, NoSQL, and graph databases. Familiarity with SQL and NoSQL query languages is essential.
- Data structures and algorithms including space and time complexity requirements.
- Candidates are expected to have expertise in statistical methods including ANOVA, multivariate regression, exploratory and confirmatory factor analysis, multidimensional scaling, and cluster analysis.
- Exposure to the model deployment lifecycle. This includes exporting the model, using REST services to expose the model, defining KPIs to continuously monitor model performance, model re-training, transfer learning, and basic knowledge of MLOps.
- Experience on GCP with exposure to BigQuery, Vertex AI, Kubeflow, and Generative AI modules.
- Should keep up to date with the latest trends, including Generative AI.
Computer Vision (If applicable)
- Must have experience in common computer vision use cases like face recognition, image classification, pose estimation, medical image analysis, edge detection, etc.
- Should have experience in image preprocessing concepts like image filtering, noise reduction, image masking, and image segmentation, and with libraries such as Albumentations.
- CNNs and CNN architectures like AlexNet, GoogLeNet, and ResNet.
- Should be well-versed with Object Detection, Segmentation, and Tracking Algorithms
- Implement and fine-tune pre-trained models such as YOLO, Mask R-CNN, and Segment Anything Model (SAM), multi-modal models like DALL-E and Gemini, and other state-of-the-art image segmentation and detection models.
- Should be well-versed with commonly used models like CNN and GAN, and libraries like PyTorch, OpenCV, and Matplotlib, while working on image and video analysis.
- Strong knowledge and hands-on experience in computer vision, object detection, and segmentation algorithms.
- Proficiency in Python for developing and training machine learning models. Experience with TensorFlow, PyTorch, YOLO, or similar frameworks is required.
Gen AI and LLMs
- Must understand the working of core concepts like attention mechanisms, encoders, decoders, and Transformer models like BERT. Exposure to frequently used models like attention-based RNN, LSTM, BERT, and GPT.
- Exposure to LLMs like OpenAI models/Gemini/Llama/Claude models and multimodal models.
- Experience in prompt design and versioning.
- LLM frameworks like LangChain and LlamaIndex.
- Vector embeddings and databases like Chroma/Weaviate/SingleStore.
- Model evaluation, debugging, and basic LLMOps concepts (deployment and monitoring).
- Should be able to use open source services and build custom models using transfer learning.

