Features
Description
InfoShare Academy is a leading IT academy offering comprehensive training programs in new technologies for companies. Since 2015, we have helped organizations develop their technology teams through dedicated courses in Machine Learning, DevOps, Data Engineering, Python, UX/UI Design, AWS, and Kubernetes. Our training focuses on practical skills and real business cases. We work with over 300 industry practitioners, which keeps our programs aligned with current market needs. We specialize in reskilling and upskilling employees. With us, you will build effective teams that implement new technologies, accelerate innovation, and strengthen your company's competitiveness. Explore our training offerings for companies, designed to develop your employees' IT competencies.
Azure Databricks is a big data analytics service built on Apache Spark for engineering, analyzing, and exploring data in the cloud. It is a data processing platform that offers scalability, performance, and ease of use. Azure Databricks also makes it easier for teams to coordinate work and share code.
Who is this training for
- Individuals who want to use data to optimize processes.
- Those who want to better understand Apache Spark.
- Individuals with basic knowledge of data analysis.
- Programmers, Data Engineers, and Data Scientists.
What you will learn
- The fundamentals of the Azure Databricks platform.
- Data processing and preparation.
- How to analyze data with Databricks SQL.
- How to use Apache Spark.
What is the Databricks Lakehouse Platform
- Describe what the Databricks Lakehouse Platform is
- Explain the origin of the Lakehouse data management paradigm
- Outline fundamental challenges related to managing and using data
- Describe security features of the Databricks Lakehouse Platform
- Give examples of organizations that have benefited from using the Databricks Lakehouse Platform

What is Databricks SQL
- Summarize fundamental concepts for using Databricks SQL effectively
- Identify tools and features in Databricks SQL for querying data and sharing insights
- Explain how Databricks SQL supports data analysis workflows that allow users to extract and share business insights

What is Databricks Machine Learning
- Give a basic overview of Databricks Machine Learning
- Identify how Databricks Machine Learning benefits data science and machine learning teams
- Summarize the fundamental components and functionalities of Databricks Machine Learning
- Give examples of successful use cases of Databricks Machine Learning by real Databricks customers

What is the Databricks Data Science and Engineering Workspace
- Give a basic overview of the Databricks Data Science and Engineering Workspace
- Identify assets provided by the workspace
- Describe a simple development workflow that queries and aggregates data

Databricks Workspaces and Services
- Databricks architecture and services
- Data Science and Engineering Workspace
- Create and manage interactive clusters
- Notebook basics
- Git versioning with Databricks Repos
- Using Databricks Repos
- Getting started with the Databricks Platform

Delta Lakehouse
- What is Delta Lake
- Managing Delta Tables
- Manipulating tables with Delta Lake
- Advanced Delta

Relational Entities on Databricks
- Databases and Views
- Views and CTEs

ETL with Spark SQL
- Query files directly
- Providing options
- Creating Delta Tables
- Writing to tables
- Cleaning data
- Advanced SQL transformations
- UDFs

Getting Started with Databricks SQL
- Getting started with Databricks SQL
- Navigating Databricks SQL
- Unity Catalog on Databricks SQL
- Schemas, tables and views on Databricks SQL
- Basic SQL on Databricks SQL

Ingesting data for Databricks SQL
- Ingesting data
- Joins
- Delta commands in Databricks SQL

Presenting Data Visually
- Data visualization
- Data visualizations on Databricks SQL
- Dashboards on Databricks SQL
- Notifying stakeholders

Apache Spark Programming – DataFrames
- Databricks platform
- Databricks ecosystem
- Spark SQL
- DataFrames
- SparkSession
- Reader and writer
- Data sources
- DataFrame and column
- Column and expression
- Transformations, actions and rows

Apache Spark Programming – Transformations
- Aggregation
- Aggregation functions
- Datetimes
- Dates and timestamps
- Complex types
- Additional functions
- UDFs
- UDFs and vectorized functions

Apache Spark Programming – Spark Internals
- Spark architecture
- Spark cluster, Spark execution
- Shuffling and caching
- Query optimization
- Partitioning

Apache Spark Programming – Structured Streaming
- Apache Spark programming
- Streaming concepts
Duration: 24 h / 3 days
- Certificate of completion
- One-month access to training recordings (for online format)
- Customization of the training program to client needs