Data Mining Techniques for Detecting Household Characteristics Based on Smart Meter Data

Department of Informatics, Faculty of Applied Informatics and Mathematics, Warsaw University of Life Sciences, Nowoursynowska 159, 02-787 Warsaw, Poland
* Author to whom correspondence should be addressed.
Academic Editor: Thorsten Staake
Received: 16 April 2015 / Revised: 3 June 2015 / Accepted: 6 July 2015 / Published: 22 July 2015


The main goal of this research is to discover the structure of home appliances usage patterns, hence providing more intelligence in smart metering systems by taking into account the usage of selected home appliances and the time of their usage. In particular, we present and apply a set of unsupervised machine learning techniques to reveal specific usage patterns observed at an individual household. The work delivers the solutions applicable in smart metering systems that might: (1) contribute to higher energy awareness; (2) support accurate usage forecasting; and (3) provide the input for demand response systems in homes with timely energy saving recommendations for users. The results provided in this paper show that determining household characteristics from smart meter data is feasible and allows for quickly grasping general trends in data.


Streamlining Smart Meter Data Analytics

Xiufeng Liu and Per Sieverts Nielsen Technical University of Denmark

{xiuli, pernn}@dtu.dk

Abstract. Today smart meters are increasingly used worldwide. Smart meters are advanced meters capable of measuring customer energy consumption at a fine-grained time interval, e.g., every 15 minutes. The data are very sizable, and might come from different sources, along with other socio-economic metrics such as the geographic information of meters, the information about users and their property, geographic location and others, which make the data management very complex. On the other hand, data mining and the emerging cloud computing technologies make the collection, management, and analysis of the so-called big data possible. This can improve energy management, e.g., help utilities improve the management of energy and services, and help customers save money. In this regard, the paper focuses on building an innovative software solution to streamline smart meter data analytics, aiming at dealing with the complexity of data processing and data analytics. The system offers an information integration pipeline to ingest smart meter data; a scalable data processing and analytic platform for pre-processing and mining big smart meter data sets; and a web-based portal for visualizing data analytics results. The system incorporates hybrid technologies, including the big data technologies Spark and Hive and the high-performance RDBMS PostgreSQL with the in-database machine learning toolkit MADlib, which are able to satisfy a variety of requirements in smart meter data analytics.

Keywords: Streamline, Software Platform, Smart meter data, Data Analytics



Clustering Time‐Series Energy Data from Smart Meters

Alexander Lavin, Diego Klabjan


Investigations have been performed into using clustering methods in data mining time‐series data from smart meters. The problem is to identify patterns and trends in energy usage profiles of commercial and industrial customers over 24‐hour periods, and group similar profiles. We tested our method on energy usage data provided by several U.S. power utilities. The results show accurate grouping of accounts similar in their energy usage patterns, and potential for the method to be utilized in energy efficiency programs.
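To make the idea of grouping 24-hour usage profiles concrete, here is a minimal sketch using plain k-means. The abstract does not say which clustering method the authors used, so k-means is an assumption chosen only for illustration. The profiles, the `make_profile` helper and the day/night hour sets are hypothetical stand-ins for real meter data.

```python
import random

def normalize(profile):
    """Scale a daily load profile to sum to 1 so clusters reflect shape, not volume."""
    total = sum(profile)
    return [v / total for v in profile] if total else list(profile)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iters=50, seed=0):
    """Plain k-means over fixed-length profiles; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: attach each profile to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in profiles]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the hourly mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def make_profile(high_hours, high_kwh, low_kwh=1.0):
    """Hypothetical 24-value profile: high_kwh during high_hours, low_kwh otherwise."""
    return [high_kwh if h in high_hours else low_kwh for h in range(24)]

DAY = set(range(8, 18))                       # assumed daytime-heavy accounts
NIGHT = set(range(0, 6)) | set(range(20, 24)) # assumed night-heavy accounts
accounts = [make_profile(DAY, 5.0), make_profile(DAY, 6.0),
            make_profile(NIGHT, 5.0), make_profile(NIGHT, 6.0)]
labels, centroids = kmeans([normalize(p) for p in accounts], k=2)
```

Normalizing each profile first means accounts are grouped by when they use energy rather than by how much, which is one plausible reading of "similar in their energy usage patterns".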




Utilities ‘Very Excited’ about Mining Smart Meter Data and Invading Your Privacy

Posted on May 19, 2015  by SkyVision Solutions,  K.T. Weaver 





Applying Smart Meter and Data Mining Techniques to Predict Refrigeration System Performance

Jui-Sheng Chou , Anh-Duc Pham


A major challenge in many countries is providing sufficient energy for human beings and for supporting economic activities while minimizing social and environmental harm. This study predicted the coefficient of performance (COP) for refrigeration equipment under varying amounts of refrigerant (R404A) with the aid of data mining (DM) techniques. Artificial neural networks (ANNs), support vector machines (SVMs), classification and regression tree (CART), multiple regression (MR), generalized linear regression (GLR), and chi-squared automatic interaction detector (CHAID) were applied within the DM process. After obtaining the COP value, abnormal equipment conditions can be evaluated for refrigerant leakage. Analytical results from the cross-fold validation method are compared to determine the best models. The study shows that DM techniques can be used for accurately and efficiently predicting COP. In the liquid leakage phase, ANNs provide the best performance. In the vapor leakage phase, the best model is the GLR model. Experimental results confirm that systematic analyses of model construction processes are effective for evaluating and optimizing refrigeration equipment performance.



Cluster Analysis of Smart Metering Data An Implementation in Practice

Authors: Dipl.-Wi.-Ing. Christoph Flath, Dipl.-Wi.-Ing. David Nicolay, Dr. Tobias Conte, PD Dr. Clemens van Dinther, Dr. Lilia Filipova-Neumann

Research Center for Information Technology
Haid-und-Neu-Str. 10–14, 76131 Karlsruhe, Germany

Utilities and electricity retailers can benefit from the introduction of smart meter technology through process and service innovation. In order to offer customer-specific services, smart meter mass data has to be analyzed. In this article, we show how to integrate cluster analysis into a business intelligence environment and apply cluster analysis to real smart meter data to identify detailed customer clusters.


A Data Mining Framework for Electricity Consumption Analysis From Meter Data

Article in IEEE Transactions on Industrial Informatics – September 2011


This paper presents a novel data mining framework for the exploration and extraction of actionable knowledge from data generated by electricity meters. Although a rich source of information for energy consumption analysis, electricity meters produce a voluminous, fast-paced, transient stream of data that conventional approaches are unable to address entirely. In order to overcome these issues, it is important for a data mining framework to incorporate functionality for interim summarization and incremental analysis using intelligent techniques. The proposed Incremental Summarization and Pattern Characterization (ISPC) framework demonstrates this capability. Stream data is structured in a data warehouse based on key dimensions enabling rapid interim summarization. Independently, the IPCL algorithm incrementally characterizes patterns in stream data and correlates these across time. Eventually, characterized patterns are consolidated with interim summarization to facilitate an overall analysis and prediction of energy consumption trends. Results of experiments conducted using the actual data from electricity meters confirm applicability of the ISPC framework.
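The "interim summarization and incremental analysis" idea can be illustrated with a small sketch that keeps running statistics per dimension cell without storing the raw stream. This is not the paper's ISPC framework or its IPCL algorithm; it is only an assumed example built on Welford's online mean/variance update. The class names and the choice of (meter, hour-of-day) as key dimensions are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the values seen so far."""
        return self.m2 / self.count if self.count else 0.0

class StreamSummary:
    """Interim summary of a meter stream, keyed by (meter_id, hour_of_day)."""

    def __init__(self):
        self.cells = {}

    def add(self, meter_id: str, hour: int, kwh: float) -> None:
        # Each reading updates its cell and can then be discarded.
        self.cells.setdefault((meter_id, hour), RunningStats()).update(kwh)

    def stats(self, meter_id: str, hour: int) -> RunningStats:
        # Unseen cells return empty statistics without being stored.
        return self.cells.get((meter_id, hour), RunningStats())

summary = StreamSummary()
for kwh in (1.0, 2.0, 3.0):  # hypothetical readings for meter "m1" at 18:00
    summary.add("m1", 18, kwh)
evening = summary.stats("m1", 18)
```

Because each reading is folded into a fixed-size summary, the stream can be dropped after processing, which is the property that a voluminous, transient data source calls for.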




In-Home Surveillance Using Smart Meters. Privacy, Data Mining and Health Impacts

By Josh del Sol, Global Research, April 17, 2014

 A look at what utility companies, PUCs, and the former CIA director have to say about the ‘smart’ meters, data-mining, and surveillance — sans propaganda.


It’s always a drag to find out when a friend is saying one thing to your face, and another to your back.  As uncovered in our film Take Back Your Power, the way in which most utilities are now delivering the lies and propaganda — with your individual rights, security, and potentially health on the line — is elevating the trait of “two-faced” to a completely new level.

It’s important to note that the first 4 of these references have to do with the smart meters / grid infrastructure capabilities as of this time.  According to the sum of my research over the past 3 years, the plan involves achieving a greater and greater level of granularity and extraction of in-home data over time — see #5 and #6 below as examples (as well as my article on Google’s Nest acquisition).  So as far as privacy and surveillance go, according to utilities’ own documentation and writings, ‘smart’ meters are effectively a trojan horse.





“Data is ALWAYS worth money. Companies pay BIG money for information on your buying habits, for your email address, and more. Technically, according to one of the Smart Meter experts I talked with, Smart Meters could track what TV shows you watch, and potentially what food you eat, if your refrigerator is equipped with ‘Smart’ technology.” – Smart Meters Exposed


Researchers find smart meters could reveal favorite TV shows

Tests on smart meters made by German company Discovergy show that someone with network sniffing skills and equipment could determine what’s been watched by looking at lighting display patterns.

“Our test results show that two 5-minute chunks of consecutive viewing without major interference by other appliances is sufficient to identify the content,” Loehr and his fellow researchers, Ulrich Greveler and Benjamin Justus, wrote in their paper, to be presented Wednesday at the Computers, Privacy and Data Protection conference in Brussels.

“The data is exposed because it is not signed or encrypted,” Loehr said in an interview with CNET. “Anyone with access to your home network has access to this data,” he said.


Source: http://www.cnet.com/news/researchers-find-smart-meters-could-reveal-favorite-tv-shows/




Onzo looks for the details in the Data

The video segment from 4:03 to 5:24 explains the concerns about data mining by Onzo.


How Privacy (Or Lack of It) Could Sabotage the Grid

Nov 3, 2009

By Jules Polonetsky and Christopher Wolf

In October, President Obama announced $3.4 billion in federal grants to help build our nation’s Smart Grid.  The President said that the technology that will make up the Smart Grid will make the nation’s power transmission system more efficient, encourage renewable energy sources and give consumers better control over their electricity usage and costs.
The potential benefits are clear.  Far less obvious to many is that the smart power grid is also a smart information grid, a system that Cisco’s CEO has predicted will be bigger than the Internet. But while Internet privacy issues are limited to the Web activities of users, the Smart Grid will involve the collection of information about what goes on at people’s homes.  As Commerce Secretary Gary Locke stated this September, “The major benefit provided by the Smart Grid… is also its Achilles’ heel from a privacy viewpoint.”

This fall, the National Institute of Standards and Technology (NIST) identified several potential data privacy concerns involving Smart Grid technology. They include the threat of identity theft, the possibility of personal behavioral patterns being recorded and real-time surveillance.

Clearly, a significant amount of new and intimate consumer data will be available through Smart Grid technology.  There are numerous potential users of the data, including utility companies, smart appliance manufacturers, and third parties that may want the data for further consumer interactions.  Moreover, data that can be collected through smart meters and integrated home networks and appliances has significant value.  For example, Smart Grid systems may incorporate advanced broadband and data flow metering functionality, which can collect information about how much electricity an individual uses, which rooms he or she uses most, when, and how often.  Armed with this data, utility companies will be able to manage load requirements better and create a more efficient electricity distribution system.  In addition, device manufacturers will be able to understand better how their devices are used, allowing them to serve their customers better.  These Smart Grid features, however, raise questions about which entities will have access to individual user data and whether individual devices may be identified or tracked.

Potential Smart Grid data users, including utility companies and device manufacturers, must engage in responsible data management practices that build consumer confidence and trust.  Such trust can only be achieved if consumers feel that they are receiving sufficient information about and are in control of how their personal Smart Grid data is used.  Thus, Smart Grid data users must consider carefully how they will protect the integrity, privacy, and security of the Smart Grid data obtained from consumer usage patterns.  In addition, Smart Grid data must be gathered responsibly, securely, and with a measure of transparency and consumer control.

Only if consumers have confidence about how their data is used will there be the critical growth in Smart Grid technologies.  An individual consumer must be assured that information about his or her behavioral habits will be used only for the purposes understood and agreed to by that consumer and that it will be protected from improper use.  Without such responsible data management practices, there likely will be consumer resistance to Smart Grid technologies and a loss of consumer trust that could hinder Smart Grid deployment efforts, leading to lower demand for new products and reduced innovation.

Utility regulators and government policy-makers have highlighted the need for customer permission in using data from the Smart Grid.  However, requesting permission for data use and even communicating data management policies to users can be challenging.  Industry, academia and policy groups need to begin the research to determine how best to convey information to users regarding the privacy decisions they will make in incorporating Smart Grid technologies into their homes and lives.   Utilities and manufacturers should integrate the principles of Privacy by Design, a concept pioneered by Ontario Privacy Commissioner Ann Cavoukian, into the construction of their data infrastructures.

Taking key privacy concerns into account, before millions of dollars are spent, will ensure that the Smart Grid safeguards the future of our power system and our personal privacy.


The Future of Privacy Forum (FPF) is a Washington, DC based think tank that seeks to advance responsible data practices. The forum is led by Internet privacy experts Jules Polonetsky (former CPO to AOL and DoubleClick) and Christopher Wolf (Partner and chair of the privacy practice at Hogan and Hartson, LLP).  FPF also maintains an advisory board comprised of leading figures from industry, academia, law and advocacy groups. FPF was launched in November 2008.  More information about FPF is available at www.futureofprivacy.org


Also on SGN:

The Smart Grid and Consumers: Unanswered Questions

The Dangers of Meter Data (Part 1)

Data Privacy and Security Issues for Advanced Metering Systems (Part 2)


The Policy of Privacy: This article gets to the root of one of my key policy concerns. Energy usage data should NOT be protected by conventional privacy laws in the same way that personal data is protected. Why? Because the nation’s overall energy stores are a public good and underpin a far-reaching national security concern. The wise use of this limited resource is paramount to our economic dominance globally and to our quality of life locally.

Energy usage data at the meter level should be public information, much like how campaign contributions are mandatorily disclosed or chemical emissions are regulated. In fact, the energy consumption of buildings (and for that matter vehicles) should be tracked just like the square footage of a building on the tax assessment. A building’s energy consumption pattern directly affects both its value and its external cost to society. If we could figure out the right balance to this policy stumbling block, a whole new wave of innovation in conservation would likely flow from it.  Mark Huppert – 11/11/2009

The Policy of Privacy: Whilst I agree with Mark Huppert’s overall sentiment about how important the use of Energy is in respect to the overall ecological balance, I take real issue with his bucketing of all energy usage data. I have no problems with the use of energy being recorded, or by what building over a period (of e.g. 24 hours). What I feel is an infringement of Privacy, that could be further eroded by the bland concatenation of Energy Usage as a reason for segregating Energy Privacy from other Privacy regulations, is that Energy usage can now be identified to who does what, when, with what. If that’s not an invasion of privacy, I don’t know what is. In addition, it goes further: with smart meters a decision can be made centrally that XX Utility doesn’t want YY using electricity to cool their home down to 68 degrees, and via a smart meter can either decide to turn off the A/C or turn up the thermostat to 73 (for example). Not only is this an invasion of privacy, this is tantamount to Big Brother (1984) and moreover could lead to lawsuits should a medical condition be triggered by this remote operation. Mark might counter by saying that special cases like this could be notified to the “Central Command”, but then isn’t that putting Privacy further at risk, because then Medical Data (given the example in question), which is covered by existing Privacy Legislation, would need to be accessible by someone outside of the medical profession in order to ensure correct decisions are being made by a central body on behalf of a third party, who may not want that intervention in the first place?

Just a thought for discussion.  Len Inkster – 06/28/2010

Source: How Privacy (Or Lack of It) Could Sabotage the Grid by Jules Polonetsky and Christopher Wolf – SmartGridNews.com – Nov. 3, 2009



Privacy commissioner investigates BC Hydro smart meters

The Canadian Press   Jul 28, 2011 5:04 PM


Privacy Commission of BC

Office of the Information and Privacy Commission of BC



California Adopts Smart Meter Data Privacy Rules

By a 5-0 vote, California’s Public Utilities Commission (CPUC) has adopted the world’s first comprehensive set of rules assuring consumers can access the detailed energy usage data provided by their smart meter while simultaneously protecting the data’s privacy and security.

The CPUC’s decision applies to the state’s three largest electric utilities—Pacific Gas & Electric (PG&E), San Diego Gas & Electric (SDG&E), and Southern California Edison (SCE)—which serve eight out of 10 Californians and which combined have deployed approximately eight million smart meters, with the final three million to be installed by the end of 2012.

The decision means the utilities must provide daily updates on detailed energy usage, bill-to-date, month-end bill forecast, and projected month-end energy price, to be made available on the companies’ respective web sites. This rule is similar to what Westar Energy will start providing to its SmartStar customers in Kansas.

Another provision requires tier alerts. In California, consumers pay more per kilowatt-hour the more energy they use, with pricing rates divided into five tiers. When customers move from one price tier to the next, the utilities need to send notifications. The Commission says the notifications can be made “via e-mail, text message, tweet, chat, or some other form of rapid communication.” PG&E already provides this service.
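A tier alert of this kind amounts to watching month-to-date usage and flagging the day it crosses into a higher price tier. The sketch below shows that logic under stated assumptions: the kWh limits in `TIER_LIMITS_KWH` are invented for illustration and are not any utility's actual tariff, and the function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical cumulative kWh limits that end tiers 1 to 4;
# any usage above the last limit falls in tier 5.
TIER_LIMITS_KWH = [300, 400, 600, 900]

def tier_for(cumulative_kwh):
    """Return the 1-based price tier for a month-to-date usage total."""
    # bisect_left keeps usage exactly at a limit in the lower tier.
    return bisect_left(TIER_LIMITS_KWH, cumulative_kwh) + 1

def tier_alerts(daily_kwh):
    """Return (day, old_tier, new_tier) for each day usage enters a higher tier."""
    alerts = []
    total, current = 0.0, 1
    for day, kwh in enumerate(daily_kwh, start=1):
        total += kwh
        new = tier_for(total)
        if new > current:
            alerts.append((day, current, new))
            current = new
    return alerts

# Month-to-date totals after each day: 100, 200, 350, 450, 750 kWh.
alerts = tier_alerts([100, 100, 150, 100, 300])
```

Each tuple in `alerts` would correspond to one notification sent by e-mail, text message or similar.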

All residents and businesses served by California’s three largest utilities have the option of switching to a time-of-use rate. So the Commission is requiring consumers be provided with a rate calculator to help them determine whether they would save money by switching to a time-of-use rate. This tool would use an individual customer’s data as collected by the utility.
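In essence, such a rate calculator prices the same metered usage under both rate plans and reports which is cheaper. The following sketch assumes invented prices, tier limits and peak hours (`TIERS`, `PEAK_HOURS`, `PEAK_PRICE`, `OFF_PEAK_PRICE`); none of these values come from the CPUC decision or from any utility's tariff.

```python
# Hypothetical tiered plan: (upper cumulative kWh limit, price per kWh).
TIERS = [(100, 0.10), (200, 0.15), (300, 0.20), (400, 0.25), (float("inf"), 0.30)]

# Hypothetical time-of-use plan: peak price from 12:00 to 17:59, off-peak otherwise.
PEAK_HOURS = range(12, 18)
PEAK_PRICE, OFF_PEAK_PRICE = 0.30, 0.10

def tiered_bill(total_kwh):
    """Price total usage tier by tier: each slice of usage is billed at its tier's rate."""
    bill, lower = 0.0, 0.0
    for upper, price in TIERS:
        if total_kwh <= lower:
            break
        bill += (min(total_kwh, upper) - lower) * price
        lower = upper
    return bill

def tou_bill(readings):
    """Price (hour_of_day, kwh) readings at the peak or off-peak rate."""
    return sum(kwh * (PEAK_PRICE if hour in PEAK_HOURS else OFF_PEAK_PRICE)
               for hour, kwh in readings)

def compare_rates(readings):
    """Return both bills and the name of the cheaper plan for one customer's data."""
    readings = list(readings)
    tiered = tiered_bill(sum(kwh for _, kwh in readings))
    tou = tou_bill(readings)
    cheaper = "time_of_use" if tou < tiered else "tiered"
    return {"tiered": tiered, "time_of_use": tou, "cheaper": cheaper}

# Hypothetical month: 30 kWh used at 14:00 (peak) and 220 kWh at 02:00 (off-peak).
result = compare_rates([(14, 30.0), (2, 220.0)])
```

With these assumed numbers the customer uses little energy during peak hours, so the time-of-use plan comes out cheaper. A customer with heavy afternoon usage would see the opposite result, which is why the tool needs each customer's own interval data.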

The smart meters installed by California’s big three power utilities contain a Home Area Network interface, basically a radio that uses the ZigBee standard for transmitting data to homes and businesses. Currently, the interface has not been activated. But the CPUC decision requires each utility to file plans that “include an initial phase with a rollout that enables a minimum of 5,000 HAN-enabled devices to be directly connected with smart meters, as envisioned in the decisions approving the deployment of AMI (Advanced Metering Infrastructure)— even if full functionality and rollout to all customers awaits resolution of technology and standard issues.”

In addition, consumers will be able to authorize third parties to receive their backhauled smart meter data directly from the utility to support a variety of services including energy efficiency and demand response. The three major utilities will submit applications to the Commission with specific plans, including which standards they will use, expected to be the Open Automated Data Exchange (OpenADE) standard that is in its final development with NIST’s Smart Grid Interoperability Panel and the North American Energy Standards Board. The utilities, the Commission notes, “will bear no new liability for the actions of third parties” that acquire the information.

But to better protect consumer privacy and data security, the CPUC will have jurisdiction over third parties who receive data, whether obtained when providing services to utilities or when authorized by consumers. However, the CPUC will not exercise jurisdiction over third parties who receive energy usage data directly from a device installed at a residence or business that receives data via the HAN interface.

In presenting its ruling, the CPUC said it relied primarily on existing privacy law, using the Fair Information Practice Principles, developed by the United States Department of Homeland Security, as its privacy framework.