REQUEST COMPLIMENTARY SQLS*PLUS LICENCE
FrankenQueries: when SQL and NoSQL collide
Preamble
IBM pureXML, a proprietary XML database built on a relational mechanism (designed for puns) that offers both relational ( SQL / XML ) and unstructured ( XQuery ) query languages, and MarkLogic, a database built from scratch on the basis of a new database paradigm (call it NoSQL if you want) that understands unstructured data and offers unstructured query language ( XQuery ).
Another important information is the new trend among NoSQL database providers for implementing SQL (or SQL-like interfaces). An example is the recent promotion of Cassandra with CQL or even more mature SQL interfaces based on Hadoop.
When two worlds collide
NoSQL about No SQL . For me, this means shifting the emphasis to non-relational database alternatives, which may even explore different interfaces to the database (and do not care about political correctness). This is a good thing! Blindly admitting the weakness of SQL for business? Well, even if SQL is the right choice for your product, you still have to think about the consequences and make sure that things are well aligned between the two worlds. In other words, it means removing the “blind” part and reducing “lame” to an acceptable minimum for your developers.
Data Model
In relational you have:
RowSet -> SQL -> RowSet
RowSet is something like that:
RowSet -> Item+
Item -> INT | VARCHAR n | ...
I will tell you about the XPath data model:
XDM -> XPath/XQuery -> XDM
And XDM is something like that:
XDM -> Item+
Item -> AtomicType | Tree
AtomicType -> integer | string | ...
...
(Both are simplified, but serve a purpose) .
A distinctive feature of the data model for the document is that the trees are not flat:
{
"namespace": "person-2.0",
"comments": "This guy asked me for a dinosaur sticker. What a nutter!",
"person": {
"handle": "dscape",
"comments": "Please do not send unsolicited mail."
}
}
Thus, there are many interpretations of what this can mean:
SELECT comments from PERSON where handle = "dscape"
Which element of “comment” does the request refer to? If you look at SQL / XML, your query will look like this:
SELECT XMLQuery('$person/comments')
FROM PERSON
WHERE XMLExists('$person/person/handle')
This leads to this obvious conclusion: trees need a way to navigate. In XML it is XPath, in JSON it might be JSONSelect, maybe something else. But you still need the standard navigation method in the first place.
What makes this task even more interesting is version control and circuit development. Despite the fact that this has been ignored for ages in the relational world (with serious consequences for business due to downtime in these funny moments of table changes). , this is indeed not to be ignored for documents. Think about Microsoft Word – how many different versions of documents do they support? Word 2003, 2005, etc.
No schema, flexibility, unstructured: choose your word, but they are all subject to the rapid evolution of data formats. In this query, we assume that the descriptor is a human child and that the comments that I am an idiot are a direct descendant of the tree. This will certainly change. And SQL does not support the versioning of documents, so you will have to extend it to make it work.
The real query language for unstructured data must take the version into account. In XQuery we can express this query as something like that:
declare namespace p = "person-2.0" ;
for $person in collection('person')
let $comments-on-person := $person/p:comments
where $person/p:handle = "dscape"
return $comments-on-person
Frankenqueries, for example
Someone once mentioned me (talking about SQL / XML) as these Frankenqueries. The term has stuck to my head so far. Let’s look a little further at this analogy and look for places where organic parts and bolts come together.
Let’s present two shopping lists, one for Joe and one for Mary.
marys-shopping.json
{"fruit": {
"apples": 2
}, "apples": 5 }
joes-shopping.json
{"fruit": {
"apples": 6,
"oranges": 1
} }
Now with my “imaginary” SQL / JSON extension, I do:
SELECT apples
FROM LISTS
What does it return? Remember, RowSet goes in, RowSet goes out?
2, 5
---
6
Thus, even if you explicitly request a list of apple numbers, you get two sets of rows instead of three, and one of the sets of rows will have a list of numbers. If you choose to return three things instead, you will get two RowSet sets and three RowSet sets. I am not a mathematician, but that does not sound good.
Once again, it is not a problem if you use something that might deal with unstructured information. You don’t have this problem in javascript and of course, it won’t be in XQuery. Both in javascript and in XQuery it is all organic.
Conclusion: stunning languages for unstructured data, unicorns, and pixie dust!
Although XQuery is an excellent language for unstructured information, my point of view here does not protect its use. The point I’m trying to emphasize is the need for a real language for unstructured data, no matter how you (read: developers) choose it.
But I ask you (the developers) not to take back the “lame SQL”. She’s gone, and you have a new hot date called NoSQL. Just give it some time and it will grow on you. It’s also very fun to write JavaScript code that works in databases: don’t let them take it away from you.
SQL for unstructured data will fail. Then PL-SQL for unstructured data will fail. So if the vendor insists on what you need, don’t accept anything less than a full programming language: you can write your complete application in javascript and save it in CouchApp, or you can write your complete application in XQuery and save it in MarkLogic. And so it should be!
Here is a checklist of what to look for in the query language for unstructured information:
- The language of the navigation
- Data Model
- Normal expressions
- Lambda
- Functions of high order
- Functional fragrance
- Good line processing
- Modules so that you can create your own libraries
- The application server is aware: has functions that serve REST
You may ignore this advice, but in the end, you may feel frustrated with the Silverlight developer. And we, the guys who like to innovate in databases, will be disappointed that the developers decided to go back!
SQL vs NoSQL Explained
Enteros
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of RDBMS, NoSQL, and machine learning database platforms.
MORE NEWS
PreambleNoSql is not a replacement for SQL databases but is a valid alternative for many situations where standard SQL is not the best approach for...
PreambleMongoDB Conditional operators specify a condition to which the value of the document field shall correspond.Comparison Query Operators $eq...
5 Database management trends impacting database administrationIn the realm of database management systems, moreover half (52%) of your competitors feel...
The data type is defined as the type of data that any column or variable can store in MS SQL Server. What is the data type? When you create any table or...
PreambleMS SQL Server is a client-server architecture. MS SQL Server process starts with the client application sending a query.SQL Server accepts,...
First the basics: what is the master/slave?One database server (“master”) responds and can do anything. A lot of other database servers store copies of all...
PreambleAtom Hopper (based on Apache Abdera) for those who may not know is an open-source project sponsored by Rackspace. Today we will figure out how to...
PreambleMongoDB recently introduced its new aggregation structure. This structure provides a simpler solution for calculating aggregated values rather...
FlexibilityOne of the most advertised features of MongoDB is its flexibility. Flexibility, however, is a double-edged sword. More flexibility means more...
PreambleSQLShell is a cross-platform command-line tool for SQL, similar to psql for PostgreSQL or MySQL command-line tool for MySQL.Why use it?If you...
PreambleWriting an application on top of the framework on top of the driver on top of the database is a bit like a game on the phone: you say “insert...
PreambleOracle Coherence is a distributed cache that is functionally comparable with Memcached. In addition to the basic function of the API cache, it...
What is PostgreSQL array? In PostgreSQL we can define a column as an array of valid data types. The data type can be built-in, custom or enumerated....
PreambleIf you are a Linux sysadmin or developer, there comes a time when you need to manage an Oracle database that can work in your environment.In this...
PreambleStarting with Microsoft SQL Server 2008, by default, the group of local administrators is no longer added to SQL Server administrators during the...
PreambleWhen administering PostgreSQL database servers, one of the most common tasks you will probably perform is enumerating databases and their tables....