We’re looking for data scientists who turn ambiguous business questions into models and analyses that actually get used. You’ll frame the problem, wrangle the data, build and validate models, and, just as importantly, explain what the numbers mean to people who don’t speak Python.

Your day-to-day will revolve around the classic Python data stack: pandas, NumPy, and scikit-learn. You’ll also work on modern enterprise data platforms: running notebooks and lakehouse workloads in Microsoft Fabric, building pipelines and operational applications in Palantir Foundry, and deploying analytics on Azure, AWS, or GCP. The science matters, but so does making it work inside a real organization’s data ecosystem.

What We Offer

- AI Grant: Stop talking about AI and start building it. Our AI Grant gives you dedicated budget and resources to turn your wildest AI idea into a working project, backed by two paid weeks to focus on nothing else.
- AI Center of Excellence: Work alongside specialists in agentic AI, sovereign AI, generative and discriminative AI. This isn’t a siloed team; it’s the people you’ll learn from and build with daily.
- Your tools, your choice: Full access to AI-powered development tools including Claude, Cursor, and GitHub Copilot. Pick what works best for you.
- Real project variety: From generative AI for legal document compliance through agentic systems in manufacturing environments to enterprise-scale AI platforms, computer vision, and autonomous driving. You won’t get bored.
- Conference and speaking support: Want to attend conferences? We’ll back you. Want to speak at them? Even better: we’ll support you with dedicated preparation time and bonuses.

Your tasks

- Translate business problems into analytical solutions: define hypotheses, choose metrics, and select the right modeling approach for the question at hand
- Explore, clean, and prepare data using Python (pandas, NumPy), working with structured and semi-structured sources of varying quality
- Build, validate, and tune machine learning models for classification, regression, forecasting, segmentation, and recommendation using scikit-learn, XGBoost, and statsmodels
- Design and analyze experiments: A/B tests, statistical hypothesis testing, and causal analysis that hold up to scrutiny
- Work hands-on with enterprise data platforms such as Microsoft Fabric (lakehouses, notebooks, semantic models) and Palantir Foundry (pipelines, ontology, operational workflows)
- Communicate findings through clear visualizations, dashboards, and narratives tailored to technical and non-technical stakeholders alike
- Collaborate with data engineers on data availability and quality, and with ML/AI engineers to move models from notebook to production
- Monitor deployed models for drift and degradation, and own the retraining and improvement cycle

Requirements

- At least 4 years in data science or applied analytics, with models and analyses that made it past the prototype stage
- Strong Python skills across the standard data stack (pandas, NumPy, scikit-learn)
- Solid statistical foundations: hypothesis testing, regression, experimental design, and knowing when a result is real versus noise
- Knowledge of at least one cloud/data platform and its data science components: Azure (Microsoft Fabric, Azure Machine Learning), AWS (SageMaker), GCP (Vertex AI, BigQuery ML), Snowflake (Snowpark ML, Cortex), Databricks (MLflow, Mosaic AI), or Palantir (Foundry Code Workspaces, Foundry ML, AIP)
- Ability to communicate analytical results clearly to business stakeholders and influence decisions with data
- Comfortable working autonomously while collaborating effectively with data engineers, architects, and product teams
- Fluent English, both written and spoken
- Fluent Polish required
- Residing in Poland required

Nice to have

- Production experience with Microsoft Fabric or Palantir Foundry certification (Foundry Data Engineer / Data Scientist tracks)
- Experience with time series forecasting, NLP, recommender systems, or exposure to LLM-based workflows
- Familiarity with MLOps practices: MLflow, experiment tracking, model versioning, and CI/CD for ML
- Experience with Power BI or other BI tools, and distributed processing with Spark/PySpark

Job no. 260612-N20H3

Sii ensures that all hiring decisions are made solely on the basis of qualifications and competence. We are committed to equal and fair treatment of all, regardless of legally protected characteristics. At Sii, we promote a diverse and inclusive work environment, in full compliance with applicable anti-discrimination laws.

Benefits For You

- Great Place to Work
- Solid financial situation
- Contracts with the biggest brands
- Internal training centre
- Many experts you can learn from
- Open and accessible management team
- Profit sharing
- Passion Sponsorship program
- Regular integration events and trips
- Comfortable and well-equipped offices
- MySii app
- Medical care

Data Scientist – Python & Cloud Data Platforms (f/m/x)
