Return to flip book view

FABRIC NOTES

Page 1

Message FABRICHANDBOOKby Eugene MazarakisFocused on certification exam

Page 2

Table ofContents01010202030304040505About The WriterFabric ExaminationFabricPower BISQL

Page 3

About TheWriter“Anythingworth havingtakes time.“Meet Eugene Mazarakis, a senior Data Engineer with a wealth ofexperience across diverse industries, including energy, banking andtelecommunications. Over the course of numerous challengingprojects, Eugene has honed his expertise in designing, implementing,and optimizing data solutions tailored to complex business needs.He combines technical prowess with a deep understanding ofindustry demands, making him a trusted resource for transformingcomplex concepts into actionable solutions.He holds a Master of Science degree in Theoretical Computer Sciencefrom the National and Kapodistrian University of Athens.0101

Page 4

FABRIC EXAMINATIONYou can take the examination online, whichconsists of approximately 60 questions, includingmultiple-choice questions and use cases. You will have 1 hour and 40 minutes to completeit. If you are unsure about an answer, you canmark the question and review it later. The passing score for the examination is 700.0202

Page 5

FABRICAdmin Portal DatastoresDeploymentSolutionToolsComparison01 0203 04Some important settings that you need to know.What are the datastores,when do you need them, and what do they contain.How can you deploy yoursolution from developmentto testing and then to theproduction system.A comparison of the toolsyou need to use whenworking with the Fabricservice.0303

Page 6

Analytics’ TypesDescriptiveDescriptiveThe data you examinereveals what happened.DiagnosticDiagnosticPredictivePredictivePrescriptivePrescriptiveThe data you examinereveals why did ithappen something.The data you examinereveals what will happenin the future.Examine actions thatneed to happen.

Page 7

Fabric Admin PortalPreview featuresSecurity featuresGit integrationTenantSettingsCapacitySettingsCreate, delete capacitiesManage capacity permissionsChanges the size of capacity(redirect to azure portal)

Page 8

LakehouseWarehouseIt’s presented as a databaseabove the Data Lake anduses Delta Format tables.It’s traditional version of aDWH, data is stored in aDelta format and can bequeried through SQL.Fabric Data StoresKQLdatabaseIt’s used to analyze hugeamount of data for Real-Time intelligence.

Page 9

StructuredSemi-StructuredUnstructuredReal-TimeLakehouseDWHKQL db Data Types

Page 10

Data Ingestion (ETL / ELT)ExternalIngest Data Into Fabric ShortcutsDatabase MirroringSnowflakeCosmos DBAzure SQLDataflowGen2Data PipelineNotebookEvent StreamMax number of refreshes per day are 48.InternalLakehouse tables/filesWarehouse tablesKQL tablesADLS Gen2Amazon S3Dataverse

Page 11

Query Data by using SQLSQL Analytics EndpointIt applies only to read statements, specifically SELECT queries.Fabric DWHFor read and write statements.Query data through XMLA endpointNavigate to the workspace settings, obtain thelink, and add it into SSMS. From there, you canaccess Lakehouse and DWH.

Page 12

LakehouseExtra PermissionsReal all SQL endpoint data: access to all data of theLakehouse, through SQL Endpoint.Read all Apache Spark: access to underlying data files,through Apache Spark (notebook).Build reports on the default semantic model: use onlythe default model.Delta Table Maintenance Optimize command: It combines many small files andmakes them into a few large files.Vacuum command: Delete the files.

Page 13

WarehouseYou can monitor the performance with the following 3 ways:Capacity Metrics AppIt is a power bi app, which is installed in fabric and you canmonitor performance in the warehouse.It shows resource intensive workloads. Dynamic Management Views (DMVs)They are systemic views that they give information about thewarehouse.sys.dm_exec_connections: Info for the connectionbetween warehouse and the engine.sys.dm_exec_sessions: Info for every session.sys.dm_exec_requests: Info for active request inside asession.Query InsightsSome of the views that contains are: exec_request_history: Info for the completed SQL query.frequently_run_queries: Info for the frequently runqueries.long_running_queries: Info for the execution time of thequeries.

Page 14

DeploymentSolutionAs you create it, you also define how many levels you want.Usually they are 3 (Dev, Test, Prod). Each of the 3 levels,assigned to a Workspace.Dev: Here you make the changes.Test: Sharing this so that tests can be done.Prod: Τhis is used by the customer. Dev Test ProdDeployDeploy

Page 15

Tools ComparisonDAX StudioTabular EditorWrite, execute, debug DAXqueries.Edit data models and createmeasures, calculation groups,perspectives.Usage of Vertipaq Analyzer inorder to see the size of thesemantic model, the size ofcolumns and tables.Create iterative processes throughscripting.Analysis of data you export fromthe PBi desktop performanceAnalyzer.Usage of the Best Practice Analyzer inorder to locate common issues (i.e.Surrogate column with the'Summarize by' property set to avalue other than 'None').Perform trace analysis.Create Object Level Security.Disable implicit measures.Bootstrap the initial full load for theincremental refresh.

Page 16

POWER BIWorkspace Level SharingTypes ofFiles & Visuals Power Query Data Profiling ToolIncremental Refresh01 0203 04Information about rolesassigned to a workspace.Types of visuals and files andtheir respective use.Give insights about the data in a column.Technique to load only thenewly changed records of a table.0404

Page 17

Workspace Level SharingRoleActionsAdminUpdate, delete workspace. Only this role can define another admin.MemberAdd new user to a workspace, with <=permission.ContributorIt can see content and can edit it. ViewerIt can see content, but it cannot edit it.Access can be granted to individuals or groups, throughone of the following roles. Τhe roles are applied to theworkspace. Α user, depending on the role, can alsoperform the corresponding actions.

Page 18

Types of Files Ιf you save the file as .pbip, it will create 2 folders:One folder for the report metadata.One folder for the semantic model metadata.With this file type, you can use the GIT control in order to manage the report and the model.PBIP : power bi project filePBIT: power bi template filePBIDS: power bi data source fileRe-usable asset for the creation of power bi reports.Re-usable asset to transfer data connections betweenreports.

Page 19

Types of VisualsVisualUse CaseTable & MatrixPresent aggregate data.Bar & ColumnPresent 1 categoric & 1 numeric variable.Line & AreaPresent time-series data.CardPresent KPI metric.Pie, Donut, TreeMapPresent ratios.Combo ChartsPresent more than one measure on Y-axis.FunnelPresent movement along a linear process.GaugeTo present progress towards a goal.WaterfallTo present running total over a period.ScatterPresent relationship between 2 numericalvariables.Q & AAsk questions about the data.

Page 20

Power Query Data Profiling ToolFor each column in the query, it returns the following 3 percentages:Valid: % of valid values in the column.Errors: % of errors in the column.Empty: % of empty values in the column.By default, it occurs on the first 1,000 records, but it can also be applied to all data.Column QualityColumn Value DistributionFor each column in the query, it returns the distribution of values within that column. It also returns:Distinct: The distinct values that appear in a column.Unique: The values that appear only once in a column.Column ProfileFor each column in the query, it returns statistical metrics and the value distribution. Some of the metrics are:MIN: the minimum value of the column.MAX: the maximum value of the column.STD: the standard deviation.

Page 21

Incremental RefreshIncremental refresh provide an efficient way to handledynamic data and improve model refresh performance. It reduces the amount of data that needs to be refreshed andallows for the inclusion of real-time data. It is applicable to Fact tables in the semantic model, and ituses the following 2 parameters (names are case sensitive):RangeStart: represents the oldest, or earliest date/time.RangeEnd: represents the newest, or latest date/time.

Page 22

SQLSCDSlowly Changing DimensionsFunctions0102Framework for updating andmaintaining data stored indimension tables.Some useful functions, thatyou need to know.0505

Page 23

Slowly Changing DimensionsTypesNotes0Nothing changes1Overwrite data. No historicity on the table.2Maintains historicity. Adding a new record and alsoneeds some columns to be defined, such as:Valid from: When does the new value start to apply?Valid to: When does the new value stop to apply?IsCurrent: Optional flag (0/1). Which row is the latest.

Page 24

SQL FunctionsLEADLEAD (column to see, offset >0)It examines the next line and carries the valueof the specified column to the previous line.LAGLEASTGREATESTLAG (column to see, offset >0)It examines the previous line and carries thevalue of the specified column to the next line.LEAST ( column 1 [ , ...column N ] )It returns the minimum value from a list of oneor more expressions.GREATEST ( column 1 [ , ...column N ] )It returns the maximum value from a list of oneor more expressions.

Page 25

STAYUPDATEDYouTubeBlog Linked InThe 1nf0rmant emazarakis.github.ioeugene-mazarakis