Description
Data Scientist (m/f/d) with Python and PySpark on a Databricks Tech-Stack
Project location: Remote
– Advising on and designing business-critical data science use cases, from the business problem through to delivery and operation.
– Statistical analysis and exploration of static, mixed, and time-series data.
– Writing exploration and production code in Python and PySpark on a Databricks tech stack.
– Design and implementation of ML algorithms related to failure predictions.
– Improvement and optimization of existing ML algorithms using various tuning techniques.
– Investigation of new features, feature importance, correlation vs. causation, data leakage, and up- and downsampling strategies.
– Advising data engineers on writing production code for feature engineering.
Project skills:
A good communicator who also excels at independent work; solution-oriented, with strong analytical thinking.
– Experience in data science, machine learning, statistical modelling, multivariate analysis, exploratory data analysis, software development, and data engineering.
– Ideally, a degree in mathematics, physics, computer science, or a related field.
– Several years of Python programming experience are required, especially with PySpark, Scikit-Learn, XGBoost, MLflow, Matplotlib, and related libraries.
– Good familiarity with Azure, Git, GitLab, CI/CD, Docker, and Databricks is beneficial.
– Familiarity with, or willingness to adopt, agile ways of working.
– Prior experience with predictive maintenance tasks is a plus.
About ZeilenJOB
Portal for remote jobs