MySQL devs take cache behind shed, shot heard

The developers of MySQL Server have decided its Query Cache feature is a bottleneck and killed it off.

Given the number of results (and the diversity of advice on offer) that a search for “tuning MySQL query cache” turns up, that’s not entirely surprising.

The problem is scalability, as MySQL Server product manager Morgan Tocker writes.

The operation of the cache looks simple enough: results of SELECT statements are stored in a hash table keyed on the query text, and if an incoming query matches, the server can return the results from the last time the query executed (with protection so the server doesn’t return stale results).
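
As an illustration of that mechanism (and of why writes make it contentious), here is a minimal Python sketch of the general idea: a result cache keyed on the exact query text, with every cached entry for a table evicted when that table is modified. This is a toy model, not MySQL's actual implementation.

```python
import hashlib


class QueryCache:
    """Toy model of a result cache keyed on the exact query text,
    invalidated whenever a table the query read is written to."""

    def __init__(self):
        self._results = {}   # query hash -> cached rows
        self._tables = {}    # query hash -> tables the query read

    def _key(self, sql):
        # MySQL's cache required a byte-for-byte identical statement;
        # hashing the raw text models that behaviour.
        return hashlib.sha1(sql.encode()).hexdigest()

    def get(self, sql):
        return self._results.get(self._key(sql))

    def put(self, sql, tables, rows):
        key = self._key(sql)
        self._results[key] = rows
        self._tables[key] = set(tables)

    def invalidate(self, table):
        # Any write to a table evicts every cached SELECT that read it.
        # This is the "stale results" protection described above, and it
        # is also why the cache becomes a point of contention under
        # write-heavy, multi-core workloads.
        stale = [k for k, t in self._tables.items() if table in t]
        for key in stale:
            self._results.pop(key, None)
            self._tables.pop(key, None)
```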

The problem, Tocker writes, is that the cache “is known not to scale with high-throughput workloads on multi-core machines”.

Even if that could be fixed, he continues, the fix wouldn’t make the query cache’s performance more predictable, and that’s often more important than peak throughput for user-facing systems.

Instead of persisting with fixing the cache, MySQL Server’s developers have decided “to invest in improvements that are more generally applicable to all workloads.”

Developers who need caching can use ProxySQL, and users upgrading to MySQL 8.0 will be encouraged to use Server-side Query Rewrite instead. ®

[Source:- NDTV]

Take a closer look at your Spark implementation

Apache Spark, the extremely popular data analytics execution engine, was initially released in 2012. It wasn’t until 2015 that Spark really saw an uptick in support, but by November of that year the project showed 50 percent more activity than the core Apache Hadoop project itself, with more than 750 contributors from hundreds of companies participating in its development in one form or another.

Spark is a hot new commodity for a reason. Its performance, general-purpose applicability, and programming flexibility combine to make it a versatile execution engine. Yet that versatility also leads to varying levels of support for the product and different ways solutions are delivered.

While evaluating analytic software products that support Spark, customers should look closely under the hood and examine four key facets of how the support for Spark is implemented:

  • How Spark is utilized inside the platform
  • What you get in a packaged product that includes Spark
  • How Spark is exposed to you and your team
  • How you perform analytics with the different Spark libraries

Spark can be used as a developer tool via its APIs, or it can be used by BI tools via its SQL interface. Or Spark can be embedded in an application, providing access to business users without requiring programming skills and without limiting Spark’s utility through a SQL interface. I examine each of these options below and explain why all Spark support is not the same.

Programming on Spark

If you want the full power of Spark, you can program directly to its processing engine. There are APIs that are exposed through Java, Python, Scala, and R. In addition to stream and graph processing components, Spark offers a machine-learning library (MLlib) as well as Spark SQL, which allows data tools to connect to a Spark engine and query structured data, or programmers to access data via SQL queries they write themselves.
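
As a quick illustration of those two entry points, the hedged PySpark sketch below uses the DataFrame API and then Spark SQL against the same data; the file name and column names are invented for the example.

```python
# Rough sketch only; requires PySpark (pip install pyspark), and the
# file "sales.csv" and its "region" column are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-api-sketch").getOrCreate()

# DataFrame API: load structured data and aggregate it programmatically.
df = spark.read.option("header", "true").csv("sales.csv")
df.groupBy("region").count().show()

# Spark SQL: expose the same data as a view and query it with SQL,
# which is also how external tools reach the engine.
df.createOrReplaceTempView("sales")
spark.sql("SELECT region, COUNT(*) AS orders FROM sales GROUP BY region").show()

spark.stop()
```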

A number of vendors offer standalone Spark implementations; the major Hadoop distribution suppliers also offer Spark within their platforms. Access is exposed through either a command-line or a notebook interface.

But performing analytics on core Spark with its APIs is a time-consuming, programming-intensive process. While Spark offers an easier programming model than, say, native Hadoop, it still requires developers. Even for organizations with developer resources, deploying them to work on lengthy data analytics projects may amount to an intolerable hidden cost. For many organizations, programming on Spark is not an actionable course for this reason.

BI on Spark

Spark SQL is a standards-based way to access data in Spark. It has been relatively easy for BI products to add support for Spark SQL to query tabular data in Spark. The dialect of SQL used by Spark is similar to that of Apache Hive, making Spark SQL akin to earlier SQL-on-Hadoop technologies.

Although Spark SQL uses the Spark engine behind the scenes, it suffers from the same disadvantages as Hive and Impala: Data must be in a structured, tabular format to be queried. This forces Spark to be treated as if it were a relational database, which cripples many of the advantages of a big data engine. Simply put, layering BI on top of Spark requires transforming the data into a reasonable tabular format that the BI tools can consume.
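
As a rough illustration of that transformation step, the PySpark sketch below flattens hypothetical semi-structured JSON events into a tabular view that a BI tool could then query through Spark SQL; the file name and field names are invented.

```python
# Hedged illustration: "events.json" and its fields are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("bi-prep-sketch").getOrCreate()

# Semi-structured JSON events, one object per line.
events = spark.read.json("events.json")

# Flatten the nested fields into a plain tabular shape a BI tool can consume.
flat = events.select(
    col("user.id").alias("user_id"),
    col("user.country").alias("country"),
    col("event_type"),
    col("timestamp"),
)

flat.createOrReplaceTempView("events_flat")
spark.sql(
    "SELECT country, COUNT(*) AS events FROM events_flat GROUP BY country"
).show()

spark.stop()
```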

Embedding Spark

Another way to leverage Spark is to abstract away its complexity by embedding it deep into a product and taking full advantage of its power behind the scenes. This allows users to leverage the speed and power of Spark without needing developers.

This architecture brings up three key questions. First, does the platform truly hide all of the technical complexities of Spark? As a customer, you need to examine all aspects of how you would create each step of the analytic cycle — integration, preparation, analysis, visualization, and operationalization. A number of products offer self-service capabilities that abstract away Spark’s complexities, but others force the analyst to dig down and code — for example, in performing integration and preparation. These products may also require you to first ingest all your data into the Hadoop file system for processing. This adds extra length to your analytic cycles, creates fragile and fragmented analytic processes, and requires specialized skills.

Second, how does the platform take advantage of Spark? It’s critical to understand how Spark is used in the execution framework. Spark is sometimes embedded in a fashion that does not have the full scalability of a true cluster. This can limit overall performance as the volume of analytic jobs increases.
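
One place the difference shows up is in how the Spark session is configured. The sketch below contrasts an embedded, single-machine session with one submitted to a real cluster manager; the YARN resource settings are hypothetical and the second session only works where a cluster is actually available.

```python
from pyspark.sql import SparkSession

# Embedded, single-machine mode: every task runs in local threads, so
# throughput is capped by one box no matter how many jobs are submitted.
embedded = (
    SparkSession.builder
    .master("local[*]")
    .appName("embedded-mode-sketch")
    .getOrCreate()
)
embedded.stop()

# True cluster mode: the same application code, but work is spread across
# executors managed by a cluster manager (YARN here). The resource numbers
# are hypothetical.
clustered = (
    SparkSession.builder
    .master("yarn")
    .appName("cluster-mode-sketch")
    .config("spark.executor.instances", "8")
    .config("spark.executor.memory", "4g")
    .getOrCreate()
)
clustered.stop()
```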

Third, how are you protected for the future? The strength of being tightly coupled with the Spark engine is also a weakness. The big data industry moves quickly. MapReduce was the predominant engine in Hadoop for six years. Apache Tez became mainstream in 2013, and now Spark has become a major engine. Assuming the technology curve continues to produce new engines at the same rate, Spark will almost certainly be supplanted by a new engine within 18 months, forcing products tightly coupled to Spark to be reengineered — a far from trivial undertaking. Even setting that effort aside, you must consider whether the redesigned product will be fully compatible with what you’ve built in the older version.

The first step to uncovering the full power of Spark is to understand that not all Spark support is created equal. It’s crucial that organizations grasp the differences in Spark implementations and what each approach means for their overall analytic workflow. Only then can they make a strategic buying decision that will meet their needs over the long haul.

Andrew Brust is senior director of market strategy and intelligence at Datameer.

 

 

[Source:- Infoworld]


Drones take off in plant ecological research

Long-term, broad-scale ecological data are critical to plant research, but often impossible to collect on foot. Traditional data-collection methods can be time consuming or dangerous, and can compromise habitats that are sensitive to human impact. Micro-unmanned aerial vehicles (UAVs), or drones, eliminate these data-collection pitfalls by flying over landscapes to gather unobtrusive aerial image data.

A new review in a recent issue of Applications in Plant Sciences explores when and how to use drones in plant research. “The potential of drone technology in research may only be limited by our ability to envision novel applications,” comments Mitch Cruzan, lead author of the review and professor in the Department of Biology at Portland State University. Drones can amass vegetation data over seasons or years for monitoring habitat restoration efforts, monitoring rare and threatened plant populations, surveying agriculture, and measuring carbon storage. “This technology,” says Cruzan, “has the potential for the acquisition of large amounts of information with minimal effort and disruption of natural habitats.”

For some research questions, drone surveys could be the holy grail of ecological data. Drone-captured images can map individual species in the landscape depending on the uniqueness of the spectral light values created from plant leaf or flower colors. Drones can also be paired with 3D technology to measure plant height and size. Scientists can use these images to study plant health, phenology, and reproduction, to track disease, and to survey human-mediated habitat disturbances.

Researchers can fly small drones along set transects over study areas of up to 40 hectares in size. An internal GPS system allows drones to hover over pinpointed locations and altitudes to collect repeatable, high-resolution images. Cruzan and colleagues warn researchers of “shadow gaps” when collecting data: taller vegetation can obscure shorter vegetation, hiding it from view in aerial photographs. Thus, overlapping images are required to get the right angles to capture a full view of the landscape.

The review lists additional drone and operator requirements and desired features, including video feeds, camera stabilization, wide-angle lenses for data collection over larger areas, and must-have metadata on the drone’s altitude, speed, and elevation of every captured image.

After data collection, georeferenced images are stitched together into a digital surface model (DSM) to be analyzed. GIS and programming software classify vegetation types, landscape features, and even individual species in the DSMs using manual or automated, machine-learning techniques.
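
As a rough sketch of what the automated, machine-learning route can look like, the Python example below trains a random forest on hand-labelled pixels and then classifies the rest of a survey; the file names, feature bands, and class labels are all hypothetical.

```python
# Hedged sketch: assumes per-pixel features (e.g. RGB, near-infrared, and
# DSM height) and hand-labelled training pixels already exist as arrays.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X = np.load("labeled_pixel_features.npy")  # shape (n_pixels, n_bands), hypothetical
y = np.load("labeled_pixel_classes.npy")   # vegetation class per labelled pixel

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

# Classify every pixel in the stitched survey to produce a vegetation map.
survey_pixels = np.load("survey_pixel_features.npy")  # hypothetical file
vegetation_map = clf.predict(survey_pixels)
```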

To test the effectiveness of drones, Cruzan and colleagues applied drone technology to a landscape genetics study of the Whetstone Savanna Preserve in southern Oregon, USA. “Our goal is to understand how landscape features affect pollen and seed dispersal for plant species associated with different dispersal vectors,” says Cruzan. They flew drones over vernal pools, which are threatened, seasonal wetlands. They analyzed the drone images to identify how landscape features mediate gene flow and plant dispersal in these patchy habitats. Mapping these habitats manually would have taken hundreds of hours and compromised these ecologically sensitive areas.

Before drones, the main option for aerial imaging data was light detection and ranging (LiDAR). LiDAR uses remote sensing technology to capture aerial images. However, LiDAR is expensive, requires highly specialized equipment and flyovers, and is most frequently used to capture data from a single point in time. “LIDAR surveys are conducted at a much higher elevation, so they are not useful for the more subtle differences in vegetation elevation that higher-resolution, low-elevation drone surveys can provide,” explains Cruzan.

Some limitations impact the application of new drone technology. Although purchasing a robotic drone is more affordable than alternative aerial imaging technologies, initial investments can exceed US$1,500. National flight regulations still limit drone applications in some countries because of changing licensing rules, restricted flight elevations, and constraints on flyovers near or on private lands. And if researchers are studying plant species that cannot be identified in aerial images by their spectral light values, data collection on foot is still required.

Despite limitations, flexibility is the biggest advantage to robotic drone research, says Cruzan. If the scale and questions of the study are ripe for taking advantage of drone technology, then “using a broad range of imaging technologies and analysis methods will improve our ability to detect, discriminate, and quantify different features of the biotic and abiotic environment.” As drone research increases, access to open-source analytical software programs and better equipment hardware will help researchers harness the advantages of drone technology in plant ecological research.

 

[Source:- SD]

SQL Server monitoring tools help DBAs take back nights and weekends

Many DBAs have to work more and more nights and weekends to fulfill zero-downtime demands. But tools for monitoring and managing SQL Server might free up some of that time.

Minimizing weekend and late-night work by database administrators was a quest near and dear to the hearts of many attendees at PASS Summit 2014 in Seattle. At the conference, Thomas LaRock, president of the Professional Association for SQL Server (PASS) user group, explained the dilemma of the modern DBA: Routine maintenance, updates, patches and hardware replacements all require database downtime — but that’s unacceptable except at night or on weekends.

And as systems have grown more and more efficient, LaRock added, many database administrators (DBAs) are being asked to take on workloads that used to be spread across multiple individuals, requiring longer working hours to get everything done. “It’s only ever more, more, more,” he said in an interview.

“I’ve seen a huge growth rate of after-hours work that people are being demanded to do,” agreed Carl Berglund, director of business development at DH2i Co., a vendor of SQL Server monitoring tools and management software in Fort Collins, Colo. But, Berglund said, working more nights and weekends results in tired DBAs — and tired people make more mistakes.

One thing that potentially can help SQL Server DBAs cut down on their off-hours workloads is the database performance monitoring and management software sold by Microsoft and various third-party vendors. For example, LaRock works as database management “head geek” at SolarWinds, a vendor in Austin, Texas, that offers a tool called Database Performance Analyzer (DPA). The product, which SolarWinds acquired when it bought Confio Software last year, tracks and analyzes the “wait time” in applications running on top of a database. DPA pinpoints processes that are causing holdups and provides guidance on how to alleviate the problems and speed up processing time.
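
DPA’s internals aren’t public, but the wait-time idea itself can be sampled straight from SQL Server’s own dynamic management views. The Python sketch below illustrates that underlying approach rather than any vendor’s product; the connection string is hypothetical.

```python
# Hedged sketch: reads SQL Server's cumulative wait statistics from the
# sys.dm_os_wait_stats DMV. Commercial tools add trending, per-query
# attribution, and tuning advice on top of raw numbers like these.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=master;Trusted_Connection=yes;"  # hypothetical
)
cursor = conn.cursor()

cursor.execute(
    """
    SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC
    """
)
for wait_type, wait_time_ms, waiting_tasks in cursor.fetchall():
    print(f"{wait_type:<40} {wait_time_ms:>12} ms  ({waiting_tasks} waits)")

conn.close()
```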

DH2i has also developed tools specifically intended to combat lost nights and weekends for DBAs. Berglund presented on that topic at a SQLSaturday conference held by PASS in Orlando, Fla., in September. DH2i’s strategy is designed to provide application mobility and infrastructure independence, enabling SQL Server instances to be updated on a new virtual host with little downtime. DBAs “can do the majority of [updates and patching] in the daytime and just a stop and restart at night,” Berglund said.

New tool saves time — for other tasks

Cindy Osborn, SQL Server technology lead and SQL architect at International Paper Co. in Memphis, Tenn., is a user of the SolarWinds DPA tool. She started a trial of the software in June, when it was still known as Confio Ignite. Now Osborn uses it on a regular basis as she manages 100 instances of SQL Server for the global paper and packaging manufacturer. She found DPA while looking for a monitoring tool to help her analyze database performance and deal with code issues in applications.

Previously, whenever one of the multiple software development groups at International Paper had a problem, Osborn’s DBA team was forced to drop everything to “stop and dig,” as she described it. The developers couldn’t work on fixing the problems themselves without being given elevated access to the servers running the databases, which could cause security issues. In one case, Osborn had to do hours of code tweaking to get a homegrown incident tracking application to work correctly. At other times, she said, it took her “hours upon hours” to tell software vendors what was wrong with their applications.

With the SolarWinds tool, Osborn said her team can pinpoint problems more quickly and reduce interruptions to their usual DBA work. Another benefit, she added, is that graphs generated by DPA as part of reports on performance problems are easy for business users to understand. “Now, I don’t get, ‘Your server is slow,’ ” she said, describing phone calls with users.

Osborn is still working off-hours, but she said that DPA has helped her reduce her workload by enabling her to consolidate some SQL Server instances. Thanks to the database consolidation, she now can run SQL Server on fewer processor cores — and with fewer cores and instances to manage, Osborn has freed up some time. Much of it was taken up by other tasks, but she has noticed a decrease in night and weekend overtime.

On-call DBA hours hard to endure

Andrea Letourneau stopped working as a DBA for financial services technology provider Fiserv Inc. after her experience in an on-call position there. Letourneau, who now is a developer and database specialist at Viewpoint Construction Software in Portland, Ore., said that she was one of two people working as on-call DBAs at Fiserv, which meant she had to be available to take calls from its customers more than 50% of the time. “My husband got sick of the 3 a.m. phone calls,” she said.

While Letourneau has exchanged the duties of a DBA for writing custom code, she did offer some strategies for minimizing night and weekend work time. She said she checked system usage trends before the weekend so she could see if more hardware would be needed and put in a request to IT before it became a problem, thus cutting down on emergency calls. She also made manual checks of the database servers part of her daily routine and especially monitored disk space to make sure there was plenty of room.
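
A routine like that disk-space check is straightforward to script. The Python sketch below is a minimal illustration with a hypothetical drive letter and threshold, not a replacement for a proper monitoring product.

```python
# Minimal illustration of scripting a daily disk-space check; the drive
# letter and threshold are hypothetical, and alerting (email, ticket, etc.)
# would be whatever the team already uses.
import shutil

DATA_DRIVE = "D:\\"           # hypothetical SQL Server data drive
FREE_SPACE_THRESHOLD = 0.15   # warn when less than 15% of the drive is free

usage = shutil.disk_usage(DATA_DRIVE)
free_fraction = usage.free / usage.total

if free_fraction < FREE_SPACE_THRESHOLD:
    print(f"WARNING: only {free_fraction:.1%} free on {DATA_DRIVE}")
else:
    print(f"OK: {free_fraction:.1%} free on {DATA_DRIVE}")
```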

Letourneau added that DBAs now are able to use SQL Server monitoring tools from vendors like SolarWinds, SQL Sentry and Idera to help them with that process. “It’s definitely come a long way,” she said. “If you have good monitoring software and a good DBA doing the monitoring, it helps.”

 

[Source:- techtarget]

How Apple is grooming the iPad to take over the Mac

As Apple’s first truly new tablet in years, the iPad Pro is every bit the beast it was rumored to be. Built around a gorgeous 12.9-inch screen, it doesn’t skimp on the pixels or the power. With its eye-popping 2732-by-2048 resolution and the cutting-edge A9X chip, the big iPad Pro drew a clear line in the sand between it and the Air. As Steve Jobs might have said, it was a screamer.

With the new 9.7-inch iPad Pro, however, Apple has blurred those lines a bit. With the same screen size and resolution as its predecessor, the smaller iPad Pro might seem like an inevitable evolutionary step, an update that ticks off the usual boxes and offers basic incentives to upgrade over previous generations. But by elevating the classic form factor, Apple is grooming the iPad to one day replace the Mac, creating a diverse family with clear differences that belie the natural overlap between models.

Like a pro

In many ways, the iPad Pro is Apple’s first tablet with an identity all its own. A unique device that raises the post-PC bar, it establishes a new set of standards for what an iPad can do. Where the original iPad was criticized for being a giant iPhone, the Pro is much more than a refresh of the classic tablet; it’s Apple’s first touchscreen device truly imagined for professionals.

It’s not a toy. (Photo: Adam Patrick Murray)

Before the Pro, the iPad was viewed mostly as a companion device, more than capable of performing a variety of tasks but still seen as needing to defer to the Mac for longer, labor-intensive projects. The Pro has removed much of that perception. While Apple has made it clear that it won’t be building a hybrid machine anytime soon, the new iPad Pro is an important step in the post-PC march, one that brings it closer to replacing the Mac as our most capable device.

Breaking the trend

Apple’s idea of pro has never had much to do with screen size. While there’s a general rule that larger screens equate with more powerful processors, Apple never assumed professional Mac users preferred them. Case in point: at the same time a monstrous 17-inch PowerBook was released in 2003, a diminutive 12-inch model also made its debut, and both were geared toward professionals on the move.

But the assumption that the 12.9-inch screen would be the main distinguisher between the Pro and the Air made sense in light of the iPad mini and the iPhone Plus. Apple’s iOS naming scheme has always been contingent on the size of the screen—Plus at 5.5 inches, mini at 7.9, Air at 9.7—and without any BTO options for processor or RAM, going Pro was primarily a decision to opt for more pixels.

One Pro, two sizes—but Apple’s keeping the 9.7-inch iPad Air 2 in the lineup as well, at least for now. (Image: Apple)

But now there’s a smaller Pro, and for the first time there are two distinct models of the same iPad to choose from. If the iPad Air 2 isn’t left to languish and die, the iPad line begins to look very much like the MacBook one, with numerous options in the middle to fit various needs. Where the 7.9- and 12.9-inch models will likely continue to bookend the iPad line and appeal to specific niches of buyers, the 9.7-inch iPad Pro is the breakout star, with models differentiated by performance and expansion rather than size.

Expansion pack

When pitting the new iPad Pro against the iPad Air 2 (which is still on display at the Apple Store), there is a noticeable speed difference, not unlike the one you’ll encounter when switching from a MacBook Air to a MacBook Pro. Applications open faster, Split View is snappier, and the overall performance enhancement improves the whole experience.

Even without accessories, the iPad lets you do more. You would never shoot video with your Mac, for example, but the 9.7-inch iPad Pro can shoot in 4K, and then edit the video too. (Photo: Adam Patrick Murray)

But the beauty and the power of the PC has always been its expansion capabilities. As far back as the Macintosh 128K, Apple has cultivated a close-knit community of peripheral device makers, yet the iPad’s add-ons have mostly been limited to cases and covers. The Smart Connector and Apple Pencil change that. Much like the ports the MacBook Pro offers that the Air and the MacBook don’t, they are the main step-up features for professional Multi-Touchers, the first tablet accessories built to truly expand the capabilities of the iPad. The Smart Connector opens the iPad up to a world of expansion, and hopefully it won’t be long before hubs, docks, and hard drives are available, further blurring the division between it and the Mac.

Closing the gap

With the iPad Pro, Apple finally has a post-PC device that can actually replace a PC. iOS still pales in comparison to OS X, but with the latest multitasking capabilities, that performance gulf is becoming less of an issue. And I think iOS 10 will only continue the shift away from the iPhone.

The Pencil is a natural for artists, but also anyone who just thinks better using pencil and paper. (Image: Apple)

With the Smart Keyboard and Apple Pencil, Apple has two tools that can give iPad users smarter, faster ways to navigate and multitask, spending less time tapping the screen and more time working. And apps, too, could use a boost. The most glaring omission is Xcode, which has always been tied to the Mac. Porting its integrated development environment to the iPad wouldn’t just give coders a break; it would pave the way for powerful desktop apps to make their way to iOS without sacrificing features or dumbing down the interface.

If the iPad mini was a concentration, the iPad Pro is a maturation. To fight flagging sales, Apple has doubled down on the iPad, following the blueprint created by the Mac to build a diverse, versatile line of tablets able to handle anything you can throw at them. The post-PC revolution is far from over. But the Mac might be running out of weapons to fend it off.

 
[Source:- Macworld]