Researchers from the UGR develop new software that adapts medical technology to see the interior of a sculpture

A student at the University of Granada (UGR) has designed software that adapts current medical technology to analyze the interior of sculptures. The tool makes it possible to see inside wood carvings without damaging them, and it is intended for the restoration and conservation of sculptural heritage.

Francisco Javier Melero, professor of Languages and Computer Systems at the University of Granada and director of the project, says that the new software simplifies medical technology and adapts it to the needs of restorers working with wood carvings.

The software, called 3DCurator, is a specialized viewer that applies computed tomography (CT) to the restoration and conservation of sculptural heritage. It adapts medical CT scanning to the needs of restoration and displays a 3-D image of the carving being worked on.

Replacing traditional X-rays with this system allows restorers to examine the interior of a statue without the overlapping information produced by older techniques, revealing its internal structure, the age of the wood from which it was made, and any later additions.

“The software that carries out this task has been simplified in order to allow any restorer to easily use it. You can even customize some functions, and it allows the restorers to use the latest medical technology used to study pathologies and apply it to constructive techniques of wood sculptures,” says professor Melero.

This system, which can be downloaded for free from www.3dcurator.es, visualizes the hidden information in a carving: it verifies whether it contains metallic elements, identifies damage from wood-boring insects (xylophages) such as termites and the tunnels they make, and detects plasters or polychrome paint added later over the original finish.

The main developer of 3DCurator was Francisco Javier Bolívar, who stressed that the tool will mean a notable breakthrough in the field of conservation and restoration of cultural assets and the analysis of works of art by experts in Art History.

Professor Melero explains that this new tool has already been used to examine two sculptures owned by the University of Granada: a 16th-century statue of San Juan Evangelista and a 17th-century Immaculate, both of which can be examined virtually at the Virtual Heritage Site of the Andalusian Universities (patrimonio3d.ugr.es/).

[Source:- Phys.org]

Upcoming Windows 10 update reduces spying, but Microsoft is still mum on which data it specifically collects

There’s some good news for privacy-minded individuals who haven’t been fond of Microsoft’s data collection policy with Windows 10. When the upcoming Creators Update drops this spring, it will overhaul Microsoft’s data collection policies. Terry Myerson, executive vice president of Microsoft’s Windows and Devices Group, has published a blog post with a list of the changes Microsoft will be making.

First, Microsoft has launched a new web-based privacy dashboard with the goal of giving people an easy, one-stop location for controlling how much data Microsoft collects. The privacy dashboard has sections for Browse, Search, Location, and Cortana's Notebook, each covering a different category of data Microsoft may have received from your devices. Personally, I keep the digital-assistant side of Cortana permanently deactivated and have already set telemetry to minimal, but if you haven't taken those steps, you can adjust how much data Microsoft keeps from this page.

Second, Microsoft is condensing its telemetry options. Currently, there are four levels: Security, Basic, Enhanced, and Full. Most consumers only have access to three of them (Basic, Enhanced, and Full); the fourth, Security, is reserved for Windows 10 Enterprise and Windows 10 Education. Here's how Microsoft describes each category:

Security: Information that’s required to help keep Windows, Windows Server, and System Center secure, including data about the Connected User Experience and Telemetry component settings, the Malicious Software Removal Tool, and Windows Defender.

Basic: Basic device info, including: quality-related data, app compatibility, app usage data, and data from the Security level.

Enhanced: Additional insights, including: how Windows, Windows Server, System Center, and apps are used, how they perform, advanced reliability data, and data from both the Basic and the Security levels.

Full: All data necessary to identify and help to fix problems, plus data from the Security, Basic, and Enhanced levels.

That’s the old system. Going forward, Microsoft is collapsing the number of telemetry levels to two. Here’s how Myerson describes the new “Basic” level:

[We’ve] further reduced the data collected at the Basic level. This includes data that is vital to the operation of Windows. We use this data to help keep Windows and apps secure, up-to-date, and running properly when you let Microsoft know the capabilities of your device, what is installed, and whether Windows is operating correctly. This option also includes basic error reporting back to Microsoft.

Windows 10 will also include an enhanced privacy section that will be shown during setup and offer much finer-grained control over privacy settings. Currently, many of these controls are buried in various menus that you have to configure manually after installing the operating system.

It's nice that Microsoft is cutting back on telemetry collection at the Basic level. The problem is, as Steven J. Vaughan-Nichols writes, Microsoft is still collecting a creepy amount of information on "Full," and it still defaults to sharing all this information with Cortana, which means Microsoft has data files on people that it can be compelled to turn over under a warrant from an organization like the NSA or FBI. Given the recent expansion of the NSA's powers, this information can now be shared with a variety of other agencies without being filtered first. And while Microsoft's business model doesn't directly depend on scraping and selling customer data the way Google's does, the company is still gathering an unspecified amount of information. Full telemetry, for example, may "unintentionally include parts of a document you were using when a problem occurred." Vaughan-Nichols isn't thrilled about that idea, and neither am I.

The problem with Microsoft's disclosure is that it mostly doesn't disclose. Even Basic telemetry is described only as including "data that is vital to the operation of Windows." Okay, but what does that mean?

I’m glad to see Microsoft taking steps towards restoring user privacy, but these are small steps that only modify policies around the edges. Until the company actually and meaningfully discloses what telemetry is collected under Basic settings and precisely what Full settings do and don’t send in the way of personally identifying information, the company isn’t explaining anything so much as it’s using vague terms and PR in place of a disclosure policy.

As I noted above, I'd recommend turning Cortana (the assistant) off. If you don't want to do that, you should regularly review the information Microsoft has collected about you and delete any items you don't want to be part of the company's permanent record.

[Source:- Extremetech]

Which freaking big data programming language should I use?

You have a big data project. You understand the problem domain, you know what infrastructure to use, and maybe you've even decided on the framework you will use to process all that data, but one decision looms large: Which language should I choose? (Or, perhaps more pointedly: Which language should I force all my developers and data scientists to suffer through?) It's a question that can be put off for only so long.

Sure, there’s nothing stopping you from doing big data work with, say, XSLT transformations (a good April Fools’ suggestion for tomorrow, simply to see the looks on everybody’s faces). But in general, there are three languages of choice for big data these days — R, Python, and Scala — plus the perennial stalwart enterprise tortoise of Java. What language should you choose and why … or when?

Here’s a rundown of each to help guide your decision.

R

R is often called “a language for statisticians built by statisticians.” If you need an esoteric statistical model for your calculations, you’ll likely find it on CRAN — it’s not called the Comprehensive R Archive Network for nothing, you know. For analysis and plotting, you can’t beat ggplot2. And if you need to harness more power than your machine can offer, you can use the SparkR bindings to run Spark on R.

However, if you are not a data scientist and haven't used MATLAB, SAS, or Octave before, it can take a bit of adjustment to be productive in R. While it's great for data analysis, it's less suited to more general-purpose tasks. You'd construct a model in R, but you would consider translating the model into Scala or Python for production, and you'd be unlikely to write a clustering control system in the language (good luck debugging it if you do).

Python

If your data scientists don't do R, they'll likely know Python inside and out. Python has been very popular in academia for more than a decade, especially in areas like natural language processing (NLP). As a result, if you have a project that requires NLP work, you'll face an embarrassing number of choices, including the classic NLTK, topic modeling with Gensim, or the blazing-fast and accurate spaCy. Similarly, Python punches well above its weight when it comes to neural networks, with Theano and TensorFlow; then there's scikit-learn for machine learning, as well as NumPy and Pandas for data analysis.
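
As a concrete (and entirely hypothetical) sketch of that Python stack in action, the snippet below pulls named entities out of a sentence with spaCy and fits a toy text classifier with scikit-learn; the sample texts and labels are invented, and it assumes spaCy's small English model (en_core_web_sm) is installed.

```python
# Hypothetical sketch: spaCy for NER, scikit-learn for a toy text classifier.
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Named-entity extraction with spaCy (assumes en_core_web_sm is installed).
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apache Spark was originally developed at UC Berkeley.")
print([(ent.text, ent.label_) for ent in doc.ents])  # e.g. [('UC Berkeley', 'ORG'), ...]

# Toy classifier: TF-IDF features feeding a logistic regression model.
texts = ["spark streaming job failed", "great talk on neural networks",
         "cluster ran out of memory", "loved the keynote on deep learning"]
labels = ["ops", "ml", "ops", "ml"]
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["executor ran out of memory"]))  # likely ['ops']
```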

There's Jupyter/IPython too, the web-based notebook server that allows you to mix code, plots, and, well, almost anything, in a shareable logbook format. This had been one of Python's killer features, although these days the concept has proved so useful that it has spread across almost all languages that have a read-evaluate-print loop (REPL), including both Scala and R.

Python tends to be supported in big data processing frameworks, but at the same time, it tends not to be a first-class citizen. For example, new features in Spark will almost always appear first in the Scala/Java bindings, and it may take a few minor versions for those updates to be made available in PySpark (this is especially true for the Spark Streaming/MLlib side of development).
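
To show what working through the Python bindings looks like in practice, here's a small hypothetical PySpark sketch of a DataFrame-based word count. It assumes a local Spark 2.x (or later) installation, and the input path is a placeholder.

```python
# Hypothetical sketch: a word count via the PySpark DataFrame API.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode, split

spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()

lines = spark.read.text("logs.txt")  # placeholder input path
counts = (lines
          .select(explode(split(col("value"), r"\s+")).alias("word"))
          .groupBy("word")
          .count()
          .orderBy(col("count").desc()))
counts.show(10)

spark.stop()
```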

As opposed to R, Python is a traditional object-oriented language, so most developers will be fairly comfortable working with it, whereas first exposure to R or Scala can be quite intimidating. A slight issue is Python's requirement for correct whitespace in your code. This splits people between those who say "this is great for enforcing readability" and those of us who believe that in 2016 we shouldn't need to fight an interpreter to get a program running just because a line has one character out of place (you might guess where I fall on this issue).
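
For readers who haven't hit this before, here's a tiny self-contained illustration of what "significant whitespace" means in Python: indentation itself defines block structure, so a misplaced space changes what the program does rather than just how it looks.

```python
# Tiny illustration of Python's significant indentation.
def total(xs):
    s = 0
    for x in xs:
        s += x    # indented one level deeper: runs once per element
    return s      # dedented: runs only after the loop has finished

print(total([1, 2, 3]))  # prints 6
```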

Scala

Ah, Scala — of the four languages in this article, Scala is the one that leans back effortlessly against the wall with everybody admiring its type system. Running on the JVM, Scala is a mostly successful marriage of the functional and object-oriented paradigms, and it’s currently making huge strides in the financial world and companies that need to operate on very large amounts of data, often in a massively distributed fashion (such as Twitter and LinkedIn). It’s also the language that drives both Spark and Kafka.

As it runs on the JVM, it immediately gets access to the Java ecosystem for free, but it also has a wide variety of "native" libraries for handling data at scale (in particular Twitter's Algebird and Summingbird). It also includes a very handy REPL for interactive development and analysis, as in Python and R.

I'm very fond of Scala, if you can't tell, as it includes lots of useful programming features like pattern matching and is considerably less verbose than standard Java. However, there's often more than one way to do something in Scala, and the language advertises this as a feature. And that's good! But given that it has a Turing-complete type system and all sorts of squiggly operators ('/:' for foldLeft and ':\' for foldRight), it is quite easy to open a Scala file and think you're looking at a particularly nasty bit of Perl. A set of good practices and guidelines to follow when writing Scala is needed (Databricks' are reasonable).

The other downside: the Scala compiler is a touch slow, to the extent that it brings back the days of the classic "compiling!" XKCD strip. Still, it has the REPL, big data support, and web-based notebooks in the form of Jupyter and Zeppelin, so I forgive a lot of its quirks.

Java

Finally, there’s always Java — unloved, forlorn, owned by a company that only seems to care about it when there’s money to be made by suing Google, and completely unfashionable. Only drones in the enterprise use Java! Yet Java could be a great fit for your big data project. Consider Hadoop MapReduce — Java. HDFS? Written in Java. Even Storm, Kafka, and Spark run on the JVM (in Clojure and Scala), meaning that Java is a first-class citizen of these projects. Then there are new technologies like Google Cloud Dataflow (now Apache Beam), which until very recently supported Java only.

Java may not be the ninja rock star language of choice. But while others are straining to sort out the nests of callbacks in their Node.js applications, using Java gives you access to a large ecosystem of profilers, debuggers, monitoring tools, libraries for enterprise security and interoperability, and much more besides, most of which have been battle-tested over the past two decades. (I'm sorry, everybody; Java turns 21 this year and we are all old.)

The main complaints against Java are its heavy verbosity and the lack of a REPL (present in R, Python, and Scala) for iterative development. I've seen 10 lines of Scala-based Spark code balloon into a 200-line monstrosity in Java, complete with huge type statements that take up most of the screen. However, the new lambda support in Java 8 does a lot to rectify this situation. Java is never going to be as compact as Scala, but Java 8 really does make developing in Java less painful.

As for the REPL? OK, you got me there — currently, anyhow. Java 9 (out next year) will include JShell for all your REPL needs.

Drumroll, please

Which language should you use for your big data project? I'm afraid I'm going to take the coward's way out and come down firmly on the side of "it depends." If you're doing heavy data analysis with obscure statistical calculations, then you'd be crazy not to favor R. If you're doing NLP or intensive neural network processing across GPUs, then Python is a good bet. And for a hardened, production streaming solution with all the important operational tooling, Java and Scala are definitely great choices.

Of course, it doesn’t have to be either/or. For example, with Spark, you can train your model and machine learning pipeline with R or Python with data at rest, then serialize that pipeline out to storage, where it can be used by your production Scala Spark Streaming application. While you shouldn’t go overboard (your team will quickly suffer language fatigue otherwise), using a heterogeneous set of languages that play to particular strengths can bring dividends to a big data project.
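
As a rough sketch of that heterogeneous workflow (hypothetical paths and column names, and assuming Spark 2.x with the DataFrame-based spark.ml API), you might fit and persist a pipeline from Python and then load the saved model from a Scala application:

```python
# Hypothetical sketch: train a Spark ML pipeline in Python, persist it, and
# let a separate Scala/Java Spark application load it later.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import HashingTF, Tokenizer

spark = SparkSession.builder.appName("train-and-export").getOrCreate()
train = spark.read.parquet("training_data.parquet")  # expects "text" and "label" columns

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="words"),
    HashingTF(inputCol="words", outputCol="features"),
    LogisticRegression(maxIter=10),
])
model = pipeline.fit(train)

# On the Scala side, the production job can call
# org.apache.spark.ml.PipelineModel.load("exported_model") to reuse this model.
model.write().overwrite().save("exported_model")

spark.stop()
```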

[Source:- JW]