If you have the Content, then Apache has the Technology! A whistle-stop tour of the Apache content related projects
Transcription
1 If you have the Content, then Apache has the Technology! A whistle-stop tour of the Apache content related projects
2 Nick Burch CTO Quanticate
3 Apache Projects 154 Top Level Projects 33 Incubating Projects 46 Content Related Projects 8 Content Related Incubating Projects (that excludes another ~30 fringe ones!)
4 Picking the most interesting ones 36 Projects in 45 minutes With time for questions... This is not a comprehensive guide!
5 Active Committer - ~3 of these projects Committer - ~6 of these projects User - ~12 of these projects Interested - ~24 of these projects My experience levels / knowledge will vary from project to project!
6 Different Technologies Transforming and Reading Text and Language Analysis RDF and Structured Data Management and Processing Serving Content Hosted Content But not: Storing Content
7 What can we get in 45 mins? A quick overview of each project Roughly how they fit together / cluster into related areas When talks on the project are happening at ApacheCon The project's URL, so you can look them up and find out more! What interests me in the project
8 Transforming and Reading Content
9 Apache PDFBox Read, Write, Create and Edit PDFs Create PDFs from text Fill in PDF forms Extract text and formatting (Lucene, Tika etc) Edit existing files, add images, add text etc Continues to improve with each release!
10 Apache POI File format reader and writer for Microsoft Office file formats Supports binary & OOXML formats Strong read edit write for .xls & .xlsx Read and basic edit for .doc & .docx Read and basic edit for .ppt & .pptx Read for Visio, Publisher, Outlook Continues growing/improving with time
11 ODF Toolkit (Incubating) File format reader and writer for ODF (Open Document Format) files A bit like Apache POI for ODF ODFDOM Low level DOM interface for ODF Files Simple API High level interface for working with ODF Files ODF Validator Pure Java validator
12 Apache Tika Talks Tuesday + Wednesday Java (+app +server +OSGi) library for detecting and extracting content Identifies what a blob of content is Gives you consistent, structured metadata back for it Parses the contents into plain text, HTML, XHTML or SAX events Growing fast!
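To make the "identifies what a blob of content is" step concrete, here is a minimal Python sketch of detection by leading "magic" bytes. It is not Tika's API or its actual algorithm: the `MAGIC` table and the `detect` function are hypothetical names chosen for illustration, and a real detector uses far more signatures plus container inspection and file name hints.

```python
# Hypothetical, deliberately tiny signature table: leading bytes -> media type.
MAGIC = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF8", "image/gif"),
    # Zip is also the container for OOXML and ODF files, so a real
    # detector would look inside the archive before deciding.
    (b"PK\x03\x04", "application/zip"),
]

FALLBACK = "application/octet-stream"


def detect(blob: bytes) -> str:
    """Return a media type guessed from the first bytes of the blob."""
    for prefix, media_type in MAGIC:
        if blob.startswith(prefix):
            return media_type
    return FALLBACK
```

Usage: `detect(open(path, "rb").read(16))` only needs the first few bytes of a file, which is why this style of detection is cheap enough to run on every document.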
13 Apache Cocoon Component Pipeline framework Plug together Lego-Like generators, transformers and serialisers Generate your content once in your application, serve to different formats Read in formats, translate and publish Can power your own Yahoo Pipes Modular, powerful and easy
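The generator, transformer and serialiser idea can be sketched in a few lines of Python. This only illustrates the pipeline pattern and is not Cocoon's API (Cocoon pipelines are Java components, typically passing XML events); every function name below is hypothetical.

```python
import html
import json


def generate(rows):
    """Generator stage: turn raw (title, body) pairs into records."""
    for title, body in rows:
        yield {"title": title, "body": body}


def upper_titles(records):
    """Transformer stage: rewrite each record as it streams past."""
    for record in records:
        yield {**record, "title": record["title"].upper()}


def to_json(records):
    """Serialiser stage: emit the stream as a JSON array."""
    return json.dumps(list(records))


def to_html(records):
    """Serialiser stage: emit the same stream as HTML fragments."""
    return "".join(
        f"<h1>{html.escape(r['title'])}</h1><p>{html.escape(r['body'])}</p>"
        for r in records
    )


def run_pipeline(source, transformers, serialiser):
    """Chain the stages: source -> each transformer in order -> serialiser."""
    stream = source
    for transform in transformers:
        stream = transform(stream)
    return serialiser(stream)
```

Swapping `to_html` for `to_json` in `run_pipeline(generate(rows), [upper_titles], to_html)` serves the same content in a different format without touching the earlier stages, which is the "generate once, serve to different formats" point of the slide.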
14 Apache Xalan XSLT processor XPath engine Java and C++ flavours Cross platform Library and command line executables Transform your XML Fast and reliable XSLT transformation engine Project rebooted in 2014!
15 Apache XML Graphics: FOP XSL-FO processor in Java Reads W3C XSL-FO, applies the formatting rules to your XML document, and renders it Output to Text, PS, PDF, SVG, RTF, Java Graphics2D etc Lets you leave your XML clean, and define semantically meaningful rich rendering rules for it
16 Apache Commons: Codec Encode and decode a variety of encoding formats Base64, Base32, Hex, Binary Strings Digest crypt(3) password hashes Caverphone, Metaphone, Soundex Quoted Printable, URL Encoding Handy when interchanging content with external systems
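For readers who have not met these encodings, the following Python sketch round-trips one payload through several of them. It uses only the Python standard library as a stand-in and does not show the Commons Codec API; the helper names `encode_all` and `decode_all` are hypothetical.

```python
import base64
import binascii
import quopri
from urllib.parse import quote, unquote_to_bytes


def encode_all(payload: bytes) -> dict:
    """Encode one payload in several interchange formats."""
    return {
        "base64": base64.b64encode(payload),
        "base32": base64.b32encode(payload),
        "hex": binascii.hexlify(payload),
        "quoted-printable": quopri.encodestring(payload),
        "url": quote(payload),  # percent-encoding, returned as str
    }


def decode_all(encoded: dict) -> dict:
    """Reverse each encoding produced by encode_all."""
    return {
        "base64": base64.b64decode(encoded["base64"]),
        "base32": base64.b32decode(encoded["base32"]),
        "hex": binascii.unhexlify(encoded["hex"]),
        "quoted-printable": quopri.decodestring(encoded["quoted-printable"]),
        "url": unquote_to_bytes(encoded["url"]),
    }
```

Every value in `decode_all(encode_all(payload))` should equal the original payload, which is the property that matters when exchanging content with an external system that only accepts one of these forms.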
17 Apache Commons: Compress Standard way to deal with archive and compression formats Read and write support zip, tar, gzip, bzip, ar, cpio, unix dump, XZ, Pack200, 7z, arj, lzma, snappy, Z Wider range of capabilities than java.util.zip Common API across all formats
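The value of a common API across formats is that calling code does not need to care which archive type it was handed. The Python sketch below shows that idea using the standard library's `zipfile` and `tarfile` modules; it is an analogy rather than the Commons Compress API, and `make_zip`, `make_tar_gz` and `read_entries` are hypothetical helper names.

```python
import io
import tarfile
import zipfile


def make_zip(files: dict) -> bytes:
    """Build an in-memory zip from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def make_tar_gz(files: dict) -> bytes:
    """Build an in-memory gzip-compressed tar from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_entries(archive: bytes) -> dict:
    """Return {name: bytes} for a zip or (optionally compressed) tar archive."""
    buf = io.BytesIO(archive)
    if zipfile.is_zipfile(buf):
        buf.seek(0)
        with zipfile.ZipFile(buf) as zf:
            return {name: zf.read(name) for name in zf.namelist()}
    buf.seek(0)
    # mode "r:*" lets tarfile work out the compression transparently.
    with tarfile.open(fileobj=buf, mode="r:*") as tf:
        return {
            member.name: tf.extractfile(member).read()
            for member in tf.getmembers()
            if member.isfile()
        }
```

With these helpers, `read_entries(make_zip(files))` and `read_entries(make_tar_gz(files))` both give back the original mapping, so downstream code only ever sees names and bytes.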
18 Apache Commons: Imaging Used to be called Commons Sanselan Pure Java image reader and writer Fast parsing of image metadata and information (size, color space, icc etc) Much easier to use than ImageIO Slower though, as pure Java Wider range of formats supported PNG, GIF, TIFF, JPEG + Exif, BMP, ICO, PNM, PPM, PSD, XMP
19 Apache SIS Spatial Information System Java library for working with geospatial content Enables geographic content searching, clustering and archiving Supports co-ordinate conversions Implements GeoAPI 3.0, uses ISO standards including ISO-19111
20 Text and Language Analysis Turning Content into Data
21 Apache UIMA Unstructured Information analysis Lets you build a tool to extract information from unstructured data Language Identification, Segmentation, Sentences, Entities etc Components in C++ and Java Network enabled can spread work out across a cluster Helped IBM to win Jeopardy!
22 Apache OpenNLP Natural Language Processing Various tools for sentence detection, tokenization, tagging, chunking, entity detection etc Maximum Entropy and Perceptron based machine learning OpenNLP good when integrating NLP into your own solution UIMA wins for OOTB whole-solution
23 Apache cTAKES Clinical Text Analysis and Knowledge Extraction System cTAKES NLP system for information extraction from clinical records free text in EMR Identifies named entities from various dictionaries, eg diseases, procedures Does subject, content, ontology mappings, relations and severity Built on UIMA and OpenNLP
24 Apache Mahout Scalable Machine Learning Library Large variety of scalable, distributed algorithms Clustering find similar content Classification analyse and group Recommendations Formerly Hadoop based, now moving to a DSL based on Apache Spark
25 RDF, Structured and Linked Data Track on Wednesday
26 Apache Any23 Anything To Triples Library, Web Service and CLI Tool Extracts structured data from many input formats RDF / RDFa / HTML with Microformats or Microdata, JSON-LD, CSV To RDF, JSON, Turtle, N-Triples, N-Quads, XML
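As a rough picture of what "anything to triples" means, the Python sketch below turns CSV rows into N-Triples lines: one subject per row, one predicate per column, and the cell value as a plain literal. The mapping, the `BASE` namespace and the function names are assumptions made for illustration; this is not Any23's API or its actual CSV mapping.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace used to mint subject and predicate IRIs.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples requires inside a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(text: str) -> list:
    """Map each CSV cell to one 'subject predicate "literal" .' line."""
    lines = []
    for index, row in enumerate(csv.DictReader(io.StringIO(text))):
        subject = f"<{BASE}row/{index}>"
        for column, value in row.items():
            # Percent-encode the column name so the IRI stays valid.
            predicate = f"<{BASE}column/{quote(column)}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines
```

Once content is in this form, the same triples can be re-serialised as Turtle, JSON or RDF/XML, or loaded into a triple store.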
27 Apache Blur Search engine for massive amounts of structured data at high speed Query rich, structured data model US Census example: show me all of the people in the US who were born in Alaska between 1940 and 1970 who are now living in Kansas. Maybe? Content Classify Search Built on Apache Hadoop
28 Apache Stanbol Set of re-usable components for semantic content management Components offer RESTful APIs Can add semantic services on top of existing content management systems Content Enhancement reasoning to add semantic information to content Reasoning add more semantic data Storage, Ontologies, Data Models etc
29 Apache Clerezza For management of semantically linked data available via REST Service platform based on OSGi Makes it easy to build semantic web applications and RESTful services Fetch, store and query linked data SPARQL and RDF Graph API Renderlets for custom output
30 Apache Jena Java framework for building Linked Data and Semantic Web applications High performance Triple Store Exposes as SPARQL HTTP endpoint Run local, remote and federated SPARQL queries over RDF data Ontology API to add extra semantics Inference API derive additional data
31 Apache Marmotta Open source Linked Data Platform W3C Linked Data Platform (LDP) Read-Write Linked Data RDF Triple Store with transactions, versioning and rule based reasoning SPARQL, LDP and LDPath queries Caching and security Builds on Apache Stanbol and Solr
32 Data Management and Processing
33 Apache Calcite (Incubating) Formerly known as Optiq Dynamic Data Management framework Highly customisable engine for planning and parsing queries on data from a wide variety of formats SQL interface for data not in relational databases, with query optimisation Complementary to Hadoop and NoSQL systems, esp. combinations of them
34 Apache MRQL (miracle) Large scale, distributed data analysis system, built on Hadoop, Hama, Spark Query processing and optimisation SQL-like query for data analysis Works on raw data in-situ, such as XML, JSON, binary files, CSV Powerful query constructs avoid the need to write MapReduce code Write data analysis tasks as SQL-like
35 Apache DataFu (Incubating) Collection of libraries for working with large-scale data in Hadoop, for data mining, statistics etc Provides Map-Reduce jobs and high level language functions for data analysis, eg statistics calculations Incremental processing with Hadoop with sliding data, eg computing daily and weekly statistics
36 Apache Falcon (Incubating) Data management and processing framework built on Hadoop Quickly onboard data + its processing into a Hadoop based system Declarative definition of data endpoints and processing rules, inc dependencies Orchestrates data pipelines, management, lifecycle, motion etc
37 Apache Ignite (Incubating) Formerly known as GridGain Only just entered incubation In-Memory data fabric High performance, distributed data management between heterogeneous data sources and user applications Stream processing and compute grid Structured and unstructured data
38 Serving up your Content
39 Apache HTTPD Server Talks All day today Very wide range of features (Fairly) easy to extend Can host most programming languages Can front most content systems Can proxy your content applications Can host code and content
40 Apache Traffic Server High performance web proxy Forward and reverse proxy Ideally suited to sitting between your content application and the internet For proxy-only use cases, will probably be better than httpd Fewer other features though Often used as a cloud-edge HTTP router
41 Apache Tomcat Talks Tuesday Java based, as many of the Apache Content Technologies are Java Servlet Container And you probably all know the rest!
42 Apache Usergrid (Incubating) Backend-as-a-Service (BaaS, mBaaS) Distributed NoSQL database + asset storage Mobile and server-side SDKs Rapidly build mobile and/or web applications, inc content driven ones Provides key services, eg users, queues, storage, queries etc
43 Generating Content
44 Apache OpenOffice Tracks Tuesday and Wednesday Apache Licensed way to create, read and write your documents and content Our first big Consumer Focused project Can be used directly Or can be used as the upstream for other applications
45 Apache Forrest Document rendering solution built on top of Cocoon Reads in content in a variety of formats (xml, wiki etc), applies the appropriate formatting rules, then outputs to different formats Heavily used for documentation and websites eg read in a file, format as changelog and readme, output as html + pdf
46 Apache Abdera Atom syndication and publishing High performance Java implementation of the Atom RFCs Generate Atom feeds from Java or by converting Parse and process Atom feeds Atompub server and clients Supports Atom extensions like GeoRSS, MediaRSS & OpenSearch
47 Apache JSPWiki Feature-rich extensible wiki Written in Java (Servlets + JSP) Fairly easy to extend Can be used as a wiki out of the box Provides a good platform for new wiki based applications Rich wiki markup and syntax Attachments, security, templates etc
48 Working with Hosted Content
49 Apache Chemistry Java, Python, .NET, PHP, Mobile Atom, W*, Browser (JSON) interfaces OASIS CMIS (Content Management Interoperability Services) Client and Server bindings SQL for Content Consistent view on content across different repositories Read / Write / Manipulate content
50 Apache ManifoldCF Name has changed a few times... (Lucene/Apache Connectors) Provides a standard way to get content out of other systems, ready for sending to Lucene etc Different goals to CMIS (Chemistry) Uses many parsers and libraries to talk to the different repositories / systems Analogous to Tika but for repos
51 Chemistry vs ManifoldCF incubator /chemistry/ /connectors/ ManifoldCF treats repo as nasty black box, and handles talking to the parsers Chemistry talks / exposes repo's contents through CMIS ManifoldCF supports a wider range of repositories Chemistry supports read and write Chemistry delivers a richer model ManifoldCF great for getting text out
52 Any Questions? Any cool projects that I happened to miss?
Industry 4.0 and Big Data
Industry 4.0 and Big Data Marek Obitko, [email protected] Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and
Data processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
Glassfish, JAVA EE, Servlets, JSP, EJB
Glassfish, JAVA EE, Servlets, JSP, EJB Java platform A Java platform comprises the JVM together with supporting class libraries. Java 2 Standard Edition (J2SE) (1999) provides core libraries for data structures,
Open For Business in a Nutshell
Open For Business in a Nutshell Open Source Foundations for Enterprise Applications Si Chen Open Source Strategies, Inc. What is OFBiz? Open Source Project for Enterprise Applications (ERP/CRM/MRP) Applications
Cocoon 2 Programming: Web Publishing with XML and Java"
Cocoon 2 Programming: Web Publishing with XML and Java" Bill Brogden Conrad D'Cruz Mark Gaither StfBEX San Francisco London Introduction xv Chapter 1 The Cocoon 2 Architecture 1 The Challenges of Web Publishing
Scope. Cognescent SBI Semantic Business Intelligence
Cognescent SBI Semantic Business Intelligence Scope...1 Conceptual Diagram...2 Datasources...3 Core Concepts...3 Resources...3 Occurrence (SPO)...4 Links...4 Statements...4 Rules...4 Types...4 Mappings...5
Mr. Apichon Witayangkurn [email protected] Department of Civil Engineering The University of Tokyo
Sensor Network Messaging Service Hive/Hadoop Mr. Apichon Witayangkurn [email protected] Department of Civil Engineering The University of Tokyo Contents 1 Introduction 2 What & Why Sensor Network
Tools for Researchers
Tools for Researchers Microsoft Research & the Scholarly Information Ecosystem Lee Dirks & Alex Wade Directors, Education & Scholarly Communication Microsoft External Research Microsoft Corporation Microsoft
Developing Web Services with Eclipse
Developing Web Services with Eclipse Arthur Ryman IBM Rational [email protected] Page Abstract The recently created Web Tools Platform Project extends Eclipse with a set of Open Source Web service development
