Chief Data Officer (CDO)
A CDO is a senior executive responsible for establishing the firm's
data and information governance strategy, controls, policy development, and
an effective implementation plan that creates business value and
demonstrates the benefits and ROI.
Data Program Manager
A Data Program Manager ensures that the program charter is in
line with the organizational data strategy and the roadmap for the future.
Data Protection Officer
“Data Protection Officer” means an individual appointed as such by a Significant Data Fiduciary under the provisions of this Act
https://avishkarm.blogspot.com/2022/11/digital-personal-data-protection-bill.html
Data Scientist
Data scientists perform research and tackle open-ended questions. A data scientist
has domain expertise, which helps them create new algorithms and models
that address questions or solve problems. The data scientist takes the data
visualizations created by data analysts a step further, sifting through the
data to identify weaknesses, trends, or opportunities for an organization. The
data scientist role is critical for organizations looking to extract insight
from information assets for “big data” initiatives, and it requires a broad
combination of skills that may be best fulfilled by a team.
Data Steward
Data stewardship is the management and oversight of corporate
data by designated personnel who typically don’t “own” the data but who ensure
adherence to data laws and internally established data governance policies. Data
stewards act as trustees of the data and are intimately familiar with business
processes and data usage. Their area of responsibility covers issues such as
data quality, accessibility, usability, and security.
Database Architect
A Database Architect gathers requirements from business and
technology teams, determines the high-level design, models business data and
rules in a meaningful and consistent manner, picks the right data technology,
and reviews database objects such as tables and stored procedures as well as
the overall database design.
DBA (Database Administrator)
A DBA designs, implements, administers, and monitors data management systems
and ensures sound design, consistency, quality, and security. DBAs also perform
data housekeeping activities such as storage management, backup/restore, and
performance optimization.
DataOps
DataOps (data operations) is an agile, process-oriented methodology for
developing and delivering analytics. It brings together DevOps teams with data
engineers and data scientists to provide the tools, processes, and
organizational structures that support the data-focused enterprise. DataOps
focuses on the collaborative development of data flows and the continuous
use of data across the organization.
Data Engineer
The
data engineer moves data from operational systems into a data lake and writes
the transforms that populate schemas in data warehouses and data marts.
Data Engineers are the individuals in an organization
responsible for setting up the data infrastructure, overseeing the data
processes, and building the data pipelines that convert raw data into
consumable data products.
Data Analyst
The
data analyst takes the data warehouses created by the data engineer and
provides analytics to stakeholders. The data analyst creates visual
representations of data to communicate information in a way that leads to
insights either on an ongoing basis or by responding to ad-hoc questions. The
data analyst serves as a gatekeeper for an organization’s data so stakeholders
can understand data and use it to make strategic business decisions. Data analysts
draw conclusions from data to describe, predict, and improve business
performance. They form the core of any analytics team and tend to be
generalists versed in the methods of mathematical and statistical analysis.
Data Principal
“Data Principal” means the individual to whom the personal data relates and where such individual is a child includes the parents or lawful guardian of such a child;
Data Processor
“Data Processor” means any person who processes personal data on behalf of a Data Fiduciary
Data Lakehouse
A data lakehouse is a new, open data management architecture
that combines the flexibility, cost-efficiency, and scale of data lakes with
the data management and ACID transactions of data warehouses, enabling business
intelligence (BI) and machine learning (ML) on all data.
Data Lake
A
data lake is a storage repository that holds a vast amount of raw data in its
native format, including structured, semi-structured, and unstructured data.
Data Warehouse
A data warehouse is a data storage technology that brings
together data from multiple sources into a single system. It serves as a
centralized data hub holding large amounts of historical data that users can
query for the purpose of analytics.
Data Model
Data models are visual
representations of an enterprise’s data elements and the connections between
them. By helping to define and structure data in the context of relevant
business processes, models support the development of effective information
systems. They enable business and technical resources to collaboratively decide
how data will be stored, accessed, shared, updated and leveraged across an
organization.
Data hub
Data hubs are data stores that act
as an integration point in a hub-and-spoke architecture. They physically move
and integrate multi-structured data and store it in an underlying database.
Data mesh
A data mesh is a new approach to designing data architectures.
It takes a decentralized approach to data storage and management, having
individual business domains retain ownership over their datasets rather than
flowing all of an organization’s data into a centrally owned data lake.
Data fabric
A
data fabric is an architectural design that enables connection to data
regardless of where it is stored. This makes it possible to store data in
separate “siloed” data lakes or data warehouses, each with localized control
and governance, while still allowing users to perform queries across the
entirety of an organization’s data assets. The idea of a data fabric is to
balance the pros and cons of centralized vs. decentralized data architectures,
making it possible to have strong data protection and security without
sacrificing data visibility or insights. Data fabrics work by unifying data
assets at the compute level, rather than the storage level. In this
architecture, data can flow from different sources to a unified app and be
analyzed together without duplicating storage.
Data drift
Data drift refers to a change in data structure or meaning
that can occur over time and cause machine learning models to break. It occurs
frequently when ML models seek to describe continually changing (dynamic)
circumstances or environments.
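For illustration, here is a minimal sketch of one common way to check for drift in a numeric feature: compare a reference sample against recent data with a two-sample Kolmogorov-Smirnov test. The feature values, sample sizes, and the 0.05 threshold are illustrative, and scipy is assumed to be installed.

```python
# A minimal drift-detection sketch: compare a training-time (reference)
# sample of a feature with recent production data using a two-sample
# Kolmogorov-Smirnov test. Values and threshold are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference = rng.normal(loc=100, scale=10, size=1_000)  # training-time data
recent = rng.normal(loc=110, scale=10, size=1_000)     # shifted production data

statistic, p_value = ks_2samp(reference, recent)
if p_value < 0.05:
    print(f"Possible drift detected (p={p_value:.4f}); consider retraining.")
else:
    print(f"No significant drift (p={p_value:.4f}).")
```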
Data virtualization
Data virtualization involves
creating virtual views of data stored in existing databases. The physical data
doesn’t move but you can still get an integrated view of the data in the new
virtual data layer. This is often called data federation (or virtual database),
and the underlying databases are the federates.
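As a toy illustration of the federation idea, the sketch below uses SQLite to attach two separate databases to one connection and expose them through a single virtual view. The database and table names are made up, and a TEMP view is used because SQLite only allows temporary views to span attached databases.

```python
# A minimal federation sketch: two attached databases stand in for
# separate "siloed" stores; a single virtual view integrates them
# without physically moving the underlying data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")  # would be file paths in practice
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("CREATE TABLE sales.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO sales.customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO crm.customers VALUES (2, 'Ravi')")

# TEMP view: SQLite only lets temporary views reference attached databases.
conn.execute("""
    CREATE TEMP VIEW all_customers AS
    SELECT id, name FROM sales.customers
    UNION ALL
    SELECT id, name FROM crm.customers
""")
print(conn.execute("SELECT * FROM all_customers").fetchall())
```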
Data Migration
Data migration is the process of moving data from one system
to another. It is most often discussed in the context of the
extract/transform/load (ETL) process: the extracted data goes through a series
of preparation and transformation steps, after which it can be loaded into a
target location.
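A minimal, self-contained ETL sketch follows; the tables, columns, and the cents-to-dollars transform are hypothetical stand-ins for a real migration.

```python
# An illustrative ETL sketch: extract rows from a source database,
# transform them, and load them into a target database.
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1250), (2, 4999)])
target.execute("CREATE TABLE orders (id INTEGER, amount_dollars REAL)")

# Extract
rows = source.execute("SELECT id, amount_cents FROM orders").fetchall()
# Transform: convert cents to dollars during preparation
transformed = [(order_id, cents / 100.0) for order_id, cents in rows]
# Load
target.executemany("INSERT INTO orders VALUES (?, ?)", transformed)
target.commit()
print(target.execute("SELECT * FROM orders").fetchall())
```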
Data Democratization
Data democratization means that everybody in an organization has access to
data, with no gatekeepers creating a bottleneck at the gateway, and that people
are educated on how to work with data regardless of their technical background.
Data Science
Data science combines math and
statistics, specialized programming, advanced analytics, artificial intelligence
(AI), and machine learning with specific subject matter expertise to uncover
actionable insights hidden in an organization’s data. These insights can be
used to guide decision making and strategic planning.
Data Visualization
Data
visualization is the graphical representation of information and data. By using
visual elements like charts, graphs, and maps, data visualization tools provide
an accessible way to see and understand trends, outliers, and patterns in data.
Additionally, it provides an excellent way for employees or business owners to
present data to non-technical audiences without confusion.
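As a small example, the sketch below uses matplotlib (assumed installed) to turn a handful of made-up figures into a bar chart.

```python
# A minimal visualization sketch: a tiny dataset rendered as a bar chart.
# The quarterly figures are made up for illustration.
import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [120, 135, 128, 160]  # illustrative values

plt.bar(quarters, revenue)
plt.title("Revenue by quarter")
plt.xlabel("Quarter")
plt.ylabel("Revenue (in thousands)")
plt.show()
```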
Data dictionary
A data dictionary is a collection of the data elements in a
database or data model, with a detailed description of each element's format,
relationships, meaning, source, and usage across an organization.
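As a toy illustration, a data dictionary entry might be kept as a structure like the following; the element names and descriptions are hypothetical, and real dictionaries usually live in catalog or modeling tools.

```python
# A hypothetical data dictionary kept as a plain Python structure,
# recording format, meaning, source, relationships, and usage per element.
data_dictionary = {
    "customer.email": {
        "format": "VARCHAR(255)",
        "meaning": "Primary contact email address for the customer",
        "source": "CRM sign-up form",
        "relationships": "Unique per customer.id",
        "usage": "Marketing campaigns, account notifications",
    },
    "order.total_amount": {
        "format": "DECIMAL(10,2), INR",
        "meaning": "Order total including taxes",
        "source": "Billing system",
        "relationships": "Sum of order_line.amount for the order",
        "usage": "Finance reporting, revenue dashboards",
    },
}
print(data_dictionary["customer.email"]["meaning"])
```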
Enterprise Data Management (EDM)
Enterprise data management (EDM)
refers to a set of processes, practices, and activities focused on data
accuracy, quality, security, availability, and good governance.
Master Data Management (MDM)
Master data is all the data critical to the operation of a
business. This data is usually shared across the enterprise, and multiple
departments and personnel depend on it for decision-making.
Master data management (MDM) involves creating a single
master record for each person, place, or thing in a business, from across
internal and external data sources and applications. This information has been
de-duplicated, reconciled and enriched, becoming a consistent, reliable source.
Once created, this master data serves as a trusted view of business-critical
data that can be managed and shared across the business to promote accurate
reporting, reduce data errors, remove redundancy, and help workers make
better-informed business decisions.
Metadata
Metadata is simply data about data: a description of, and context for, the
data. It helps to organize, find, and understand data.
Data Modernization
Data modernization is the
process of transferring data to modern cloud-based databases from outdated or
siloed legacy databases, including structured and unstructured data. In that
sense, data modernization is synonymous with cloud migration.
Data Architecture
Data architecture translates
business needs into data and system requirements and seeks to manage data and
its flow through the enterprise. A data architecture describes how data
is managed, from collection through to transformation, distribution, and
consumption. It sets the blueprint for data and the way it flows through data
storage systems, and it defines the respective data model and the underlying
data structures that support it. Modern data architectures often leverage
cloud platforms to manage and process data.
Data quality
Data
quality is an integral part of data governance that ensures that your
organization’s data is fit for purpose. It refers to the planning,
implementation, and control of activities that apply quality management
techniques to data, in order to assure it is fit for consumption and meets the
needs of data consumers.
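As an illustration, the sketch below runs a few simple quality checks with pandas (assumed installed); the rules and column names are made up, not a standard.

```python
# A minimal data quality sketch: run a few illustrative rules against a
# small dataset and report pass/fail per rule.
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "email": ["a@x.com", None, "b@x.com", "not-an-email"],
    "age": [34, 29, 29, -5],
})

checks = {
    "no_duplicate_ids": df["customer_id"].is_unique,
    "no_missing_emails": df["email"].notna().all(),
    "emails_look_valid": df["email"].str.contains("@", na=False).all(),
    "ages_in_valid_range": df["age"].between(0, 120).all(),
}
for rule, passed in checks.items():
    print(f"{rule}: {'PASS' if passed else 'FAIL'}")
```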
Data observability
Data observability is the
ability to understand, diagnose, and manage data health across multiple IT
tools throughout the data lifecycle. A data observability platform helps
organizations to discover, triage, and resolve real-time data issues using
telemetry data like logs, metrics, and traces.
Data Lineage
Data lineage uncovers the life cycle of data: it aims to track the complete
data flow, from start to finish over time, by understanding, recording, and
visualizing data as it moves from data sources to consumption. This includes
all the transformations the data underwent along the way: how the data was
transformed, what changed, and why.
Data Privacy
Data privacy is a guideline for how data should be collected
or handled, based on its sensitivity and importance. Data privacy concerns
apply to all sensitive information that organizations handle, including that of
customers, shareholders, and employees. Often, this information plays a vital
role in business operations, development, and finances. Data privacy helps
ensure that sensitive data is only accessible to approved parties.
Data protection
Data protection is a set of strategies and processes
you can use to secure the privacy, availability, and integrity of your data. A
data protection strategy is vital for any organization that collects, handles,
or stores sensitive data. A successful strategy can help prevent data loss,
theft, or corruption and can help minimize damage caused in the event of a
breach or disaster.
Data Security
Data security is the practice of
protecting digital information from unauthorized access, corruption, or theft
throughout its entire lifecycle. It’s a concept that encompasses every aspect
of information security from the physical security of hardware and storage
devices to administrative and access controls, as well as the logical security
of software applications. It also includes organizational policies and procedures.
Data Program Management (DPM)
Data Program Management (DPM) is the intelligent application
of data management tools, technologies, and processes to improve the usefulness
of an organization’s data.
Data Encryption
Data encryption is the process of converting readable data (plaintext) into an
encoded form (ciphertext) that can be read only after decryption with the
appropriate key, so the data stays protected even if it is intercepted or
accessed without authorization.
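A minimal sketch of symmetric encryption follows, using the third-party cryptography package (an assumption; any equivalent library would do).

```python
# A minimal symmetric-encryption sketch using the "cryptography" package
# (assumed installed: pip install cryptography). Fernet handles key
# generation, encryption, and authenticated decryption.
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # store this securely, e.g. in a key vault
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"customer SSN: 123-45-6789")
print(ciphertext)                  # unreadable without the key
plaintext = fernet.decrypt(ciphertext)
print(plaintext.decode())
```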
Data blending
Data blending is a process that allows users to quickly
get value from multiple data sources by helping them see patterns.
Data Governance
Data governance is the collection of policies, processes and
standards that define how data assets can be used within an organization and
who has authority over them. Governance dictates who can use what data and in
what way.
Data catalog
A
data catalog is a comprehensive collection of an organization’s data assets,
which are compiled to make it easier for professionals across the organization
to locate the data they need.
Data Modeling
Data modeling is the process of
analyzing and defining all the different data your business collects and
produces, as well as the relationships between those bits of data. Data
modeling concepts create visual representations of data as it’s used at your
business, and the process itself is an exercise in understanding and clarifying
your data requirements.
Data munging
Data munging is the process of manual data cleansing prior to analysis. It is a
time-consuming process that often gets in the way of extracting true value and
potential from data.
Data pipeline
A data pipeline is a sequence of steps that collect, process,
and move data between sources for storage, analytics, machine learning, or
other uses. For example, data pipelines are often used to send data from
applications to storage devices like data warehouses or data lakes.
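As a bare-bones illustration, the sketch below chains collect, process, and store steps as plain Python functions; real pipelines usually run on an orchestrator such as Airflow.

```python
# A minimal pipeline sketch: each step is a plain function and the
# pipeline simply chains them in sequence.
def collect():
    # Stand-in for reading from an application, API, or message queue.
    return [{"user": "asha", "amount": "120.5"}, {"user": "ravi", "amount": "80"}]

def process(records):
    # Clean and type-convert the raw records.
    return [{"user": r["user"], "amount": float(r["amount"])} for r in records]

def store(records):
    # Stand-in for loading into a warehouse or data lake.
    print(f"Loaded {len(records)} records: {records}")

def run_pipeline():
    store(process(collect()))

run_pipeline()
```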
Data profiling
Data profiling is the process of evaluating the contents and
quality of data. It is used to identify data quality issues at the start of a
data project and define what data transformation steps may be needed to bring
the dataset into a ready-to-use state.
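A minimal profiling sketch with pandas (assumed installed) might look like this, with made-up data.

```python
# A minimal profiling sketch: summarize contents and surface likely
# quality issues at the start of a project.
import pandas as pd

df = pd.DataFrame({
    "city": ["Pune", "Pune", "Mumbai", None],
    "temp_c": [31.0, 31.0, 29.5, 120.0],
})

print(df.describe(include="all"))       # basic statistics per column
print(df.isna().sum())                  # missing values per column
print(df.duplicated().sum())            # duplicate rows
print(df["city"].value_counts())        # value distribution
```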
Ad-hoc query
An ad-hoc query is a single-use query generally used to
answer “on-the-fly” business questions for which there are no pre-written
queries or standard procedures.
Batch processing
Batch processing refers to the scheduling and processing of
large volumes of data simultaneously, generally at periods of time when
computing resources are experiencing low demand. Batch jobs are typically
repetitive in nature and are often scheduled (automated) to occur at set
intervals.
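As one common illustration, the pandas sketch below (pandas assumed installed, and "sales.csv" a hypothetical input) processes a large file in fixed-size chunks; scheduling the job for off-peak hours would be left to cron or an orchestrator.

```python
# A minimal batch-processing sketch: a large file is processed in
# fixed-size chunks rather than loaded all at once.
import pandas as pd

total = 0.0
for chunk in pd.read_csv("sales.csv", chunksize=100_000):
    # Each chunk is processed as one batch.
    total += chunk["amount"].sum()
print(f"Grand total: {total}")
```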
Data cleansing
Data cleansing, data cleaning or data scrubbing is the first
step in the overall data preparation process. It is the process of analyzing,
identifying and correcting messy, raw data. Data cleaning involves tasks such
as filling in missing values and identifying and fixing errors.
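A minimal cleansing sketch with pandas (assumed installed) follows; the rules and values are illustrative.

```python
# A minimal cleansing sketch: fill missing values, flag an obvious
# error, and drop duplicate rows.
import pandas as pd

df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Ravi", None],
    "age": [34, -1, -1, 29],
})

df["name"] = df["name"].fillna("UNKNOWN")      # fill in missing values
df["age"] = df["age"].where(df["age"] >= 0)    # impossible ages become NaN
df = df.drop_duplicates()                      # remove duplicate rows
print(df)
```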
Data Wrangling
Data wrangling is the process of cleaning, structuring and
enriching raw data into a desired format for better decision making in less
time.
Data Masking
Data masking is the process of
replacing sensitive information copied from production databases to test
non-production databases with realistic, but scrubbed, data based on masking
rules. Data masking is ideal for virtually any situation when confidential or
regulated data needs to be shared with non-production users. These
non-production users need to access some of the original data, but do not need
to see every column of every table, especially when the information is
protected by government regulations.
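As an illustration, the sketch below masks a small production extract with two hypothetical rules: names become placeholders and card numbers keep only their last four digits (pandas assumed installed).

```python
# A minimal masking sketch: scrub a production extract before loading it
# into a non-production database. Masking rules here are illustrative.
import pandas as pd

production = pd.DataFrame({
    "customer": ["Asha Patil", "Ravi Kumar"],
    "card_number": ["4111111111111111", "5500005555555559"],
    "balance": [1520.75, 310.00],
})

masked = production.copy()
masked["customer"] = [f"Customer {i + 1}" for i in range(len(masked))]
masked["card_number"] = "************" + masked["card_number"].str[-4:]
print(masked)   # realistic but scrubbed, safe for non-production users
```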
Data mining
Data mining is the process of finding anomalies, patterns and
correlations within large data sets to predict outcomes. Using a broad range of
techniques, you can use this information to increase revenues, cut costs,
improve customer relationships, reduce risks and more.
Data integration
Data integration is the process of combining data
from different sources into a single, unified view. Integration begins with the
ingestion process, and includes steps such as cleansing, ETL mapping, and
transformation.
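A minimal integration sketch with pandas (assumed installed) follows; the two sources and their columns are hypothetical.

```python
# A minimal integration sketch: records from two sources are lightly
# cleansed and combined into a single, unified view on a shared key.
import pandas as pd

crm = pd.DataFrame({"customer_id": [1, 2], "name": ["Asha ", "Ravi"]})
billing = pd.DataFrame({"customer_id": [1, 2], "balance": [1520.75, 310.0]})

crm["name"] = crm["name"].str.strip()             # light cleansing step
unified = crm.merge(billing, on="customer_id")    # integrate on a shared key
print(unified)
```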
SQL Performance Tuning
SQL tuning is the process of improving SQL queries to accelerate your server’s
performance. Its general purpose is to reduce the amount of time it takes a
user to receive a result after issuing a query, and to reduce the amount of
resources used to process a query. Often the same desired result set can be
produced by a faster-running query, for example by rewriting a subquery or
adding an appropriate index.
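As a small worked example, the sketch below uses SQLite's EXPLAIN QUERY PLAN to show a query switching from a full table scan to an index search after an index is added; the table and data are made up.

```python
# A minimal tuning sketch: inspect the query plan, add an index, and
# observe the plan change from a scan to an index search.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 100, i * 1.5) for i in range(10_000)])

query = "SELECT * FROM orders WHERE customer_id = 42"
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())  # SCAN orders

conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())  # SEARCH via index
```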
Data Analytics
Data analytics analyzes internal and
external data to create value and actionable insights.
Data Fiduciary
“Data Fiduciary” means any person who alone or in conjunction with other persons determines the purpose and means of processing of personal data
Cloud data warehouse
A cloud data warehouse is a database that is managed as a
service and delivered by a third party, such as Google Cloud Platform (GCP),
Amazon Web Services (AWS), or Microsoft Azure. Cloud data architectures are
distinct from on-premise data architectures, where organizations manage their
own physical database infrastructure on their own premises.
Big Data
Big
data is a term that describes large, hard-to-manage volumes of data – both
structured and unstructured – that inundate businesses on a day-to-day basis.
But it’s not just the type or amount of data that’s important, it’s what
organizations do with the data that matters. Big data can be analyzed for
insights that improve decisions and give confidence for making strategic
business moves.
NoSQL
NoSQL databases (aka "not only SQL") are
non-tabular databases and store data differently than relational tables. NoSQL databases
come in a variety of types based on their data model. The main types are
document, key-value, wide-column, and graph. They provide flexible schemas and
scale easily with large amounts of data and high user loads. NoSQL databases are built from the ground up to store and
process vast amounts of data at scale and support a growing number of modern
businesses.
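As an illustration of flexible schemas, the sketch below stores two documents with different fields using pymongo (an assumption: the driver is installed and a local MongoDB instance is running).

```python
# A minimal document-store sketch: the two documents have different
# fields, illustrating the flexible schemas of NoSQL document databases.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
users = client["appdb"]["users"]

users.insert_one({"name": "Asha", "email": "asha@example.com"})
users.insert_one({"name": "Ravi", "phone": "+91-9999999999", "tags": ["vip"]})

print(users.find_one({"name": "Ravi"}))
```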
Business Intelligence
Business intelligence (BI) refers
to capabilities that enable organizations to make better decisions, take
informed actions, and implement more-efficient business processes. BI
keeps your organization in the know, and success depends in a large part on
knowing the who, what, where, when, why, and how of the market. Business intelligence tools analyze historical and current
data and present findings in intuitive visual formats.
Data
"Data” means a representation of information, facts, concepts, opinions or instructions in a manner suitable for communication, interpretation or processing by humans or by automated means.
Personal data
“Personal data” means any data about an individual who is identifiable by or in relation to such data;
Personal data breach
"Personal data breach" means any unauthorised processing of personal data or accidental disclosure, acquisition, sharing, use, alteration, destruction of or loss of access to personal data, that compromises the confidentiality, integrity or availability of personal data.