Dell EMC Feeds
Dell EMC Feeds RSS Feed

  • [The Source Podcast] Hyper Converged Infrastructure and the Three P’s

    While Hyper-Converged Infrastructure simplifies the deployment of cloud and application solutions, it also brings some unexpected considerations.


    Chad Engel (@Engel_Chad), Senior Sales Director Hyper-Converged, likes to consider them the three P’s: People, Process and Politics. In some cases, the three P’s can lead to a delay in adopting the benefits of a hyper-converged infrastructure. Engel has some interesting thoughts on how to bring down those barriers and embrace the operational efficiencies of Hyper-Converged.

    As Dell EMC Hyper-Converged Solutions extend to include development stacks such as the VxRail AS (Azure Stack), the value proposition becomes clearer: infrastructure to accelerate application and business enablement.

    Get Dell EMC The Source app in the Apple App Store or Google Play, and subscribe to the podcast on iTunes, Stitcher Radio or Google Play.

    Dell EMC The Source Podcast is hosted by Sam Marraccini (@SamMarraccini)



  • Dell EMC PowerEdge Tops Big Data Analysis Systems Benchmark

    Dell EMC Re-takes Performance Lead on TPCx-Big Bench Benchmark at Scale Factor 10000 Using 14G Hardware Platform

    Computer benchmarking is the practice of discovering, measuring and assessing the relative performance of a system for a specific workload. Industry-standard benchmarks allow researchers to compare the best performance achieved by their systems to that of another entity (a competitor) or to an entirely different industry. This information can then be used to identify gaps in an organization’s performance processes and achieve a competitive advantage. It is also used to compare the performance of various vendor offerings within the industry.


    Dell EMC uses industry-standard benchmark tests to provide objective and verifiable performance data to its customers as it relates to its hardware platforms and solutions. Customers can use this data when designing and sizing a solution to deliver optimal performance for their business use cases and to scale as needed.

    As big data analysis systems (BDAS) mature, the pressure to evaluate and compare both the performance and the price/performance of these systems rises. To address this, the Transaction Processing Performance Council (TPC) developed the TPCx-Big Bench (TPCx-BB), a benchmark standard for comparing BDAS solutions. Detailed information about the TPCx-BB specification can be found on the TPC website.

    On March 22, 2018, Dell EMC published an SF10000 TPCx-BB benchmark result that, as of that date, occupied the number-one position in both the Performance and Price/Performance tables on the TPC website. This is the second result from Dell EMC, following the pioneering result it published in the summer of 2017. Two other results have been published at SF10000, both from HPE.

    System Under Test

    For this submission, the System Under Test (SUT) comprised one Dell EMC PowerEdge R640 server (master node) and 18 Dell EMC PowerEdge R740xd servers (slave nodes), as depicted in the System Configuration diagram below:

    System Configuration illustration

    What was tested and measured?

    The benchmark is designed to stress the CPU and IO systems of a Big Data cluster using a single stream (power test) and concurrent streams (throughput test) of 30 queries (workloads). Each run is performed in three phases: Load, Power and Throughput. For the test to pass an audit, two sequential performance runs must be executed.


    The overall TPCx-BB performance data for the Dell EMC R640/R740xd configuration is summarized in the table below:

    Load Test             2,190.66 s
    Power Test           18,949.43 s
    Throughput Test      70,134.07 s
    Performance Metric    1,660.75 BBQpm@SF10000
    Total System Cost    $908,125
    Price/Performance       546.82 $/BBQpm@SF10000
    Availability Date    March 20, 2018
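    As a quick sanity check, the Price/Performance row follows directly from the other two: total system cost divided by the performance metric. A minimal sketch using the figures from the table above:

    ```python
    # Price/performance for a TPCx-BB result is the total system cost
    # divided by the performance metric (BBQpm@SF).
    total_system_cost = 908_125       # USD, from the table above
    performance_metric = 1_660.75     # BBQpm@SF10000, from the table above

    price_performance = total_system_cost / performance_metric
    print(f"{price_performance:.2f} $/BBQpm@SF10000")  # → 546.82 $/BBQpm@SF10000
    ```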

    Detailed results and documentation are available at the TPC website.

    Competitive Landscape

    The TPCx-BB benchmark standard has published results at SF1000, 3000, 10000 and 30000. Dell EMC has one prior SF10000 result, submitted in May 2017 and based on 13G hardware (Broadwell-based Intel Xeon processors). As of April 04, 2018, only Dell EMC and HPE have published results at SF10000 based on servers with Intel Xeon Skylake processors. The published results at SF3000 and SF1000 from HPE and Huawei are based on the Intel Broadwell and Haswell processor families. The only result the Dell EMC 14G performance data can realistically be compared to is the latest HPE SF10000 result:

    two bar charts showing performance and price comparisons


    Ever since the release of the TPCx-Big Bench benchmark standard in December 2015, Dell EMC has been an active member of the TPCx-Big Bench Technical Committee and will continue to collaborate with other stakeholders within the industry to maintain the TPCx-BB specification.

    Dell EMC uses the TPCx-BB workloads to compare and differentiate its data analytics solutions and will additionally use them for performance characterizations and optimizations for the benefit of its customers.

    Industry-standard computer benchmarking should not be viewed as a purely academic exercise or a competition among vendors. Studying the design of top-performing systems has significant benefits for customers, who can now, thanks to industry collaboration on the TPCx-BB benchmark, advance the state of the art of performance engineering for data analytics.



  • Building the Platform for Data Capital

    In my last post, we introduced the concept of Data Capital and the impact it is having on every industry. Organizations everywhere have embraced the reality that data is their most valuable asset and should be supported as such. Prioritizing data as an asset is a dramatic departure from how most organizations have previously treated data.

    When I think about what our customers are trying to accomplish, it reinforces how critical our job is as a steward for their data. Many are still early in their transformation, and even the most advanced organizations won’t have a clear line of sight to where their data needs will be years out. As I’ve previously noted, data doesn’t have an expiration date, so there’s a need to support future needs that cannot be anticipated while extending support for the existing data collected over time. This requires a data platform that offers longevity and scalability while continuing to deliver the innovations necessary to meet new business needs.

    Nowhere are these challenges more pronounced than with unstructured data: information that doesn’t have a pre-defined data model or organization and can come from many sources, including documents, video, social media, images, or devices. There are three common forms of unstructured data: file, object, and streaming; each has its own characteristics and uses.

    Most organizations now know the potential their data has, but they’re often weighed down with legacy infrastructure that makes it difficult to tap into all that data. Unlike structured data that resides in a database, these file types are elusive to extract value from; this is where we’ve set our sights to solve these problems and help accelerate Digital Transformation for our customers.

    Start With the Unified Data Lake

    For many vendors, consolidation means re-platforming legacy applications and a radical shift that often requires disruptive migration – often to the cloud. We see the unified data lake as a way of eliminating the data silos that exist within an organization while providing support that spans the entire data footprint from edge to core to cloud, with solutions that account for all the unstructured data types and provide a foundation for the existing legacy applications the business relies upon, as well as new cloud-native apps.

    Get Simplicity at Scale

    When we begin to talk about the massive scale that is necessary to leverage Data Capital across an enterprise, the data platform must be able to scale performance and capacity on demand to keep up with data proliferation. Even when organizations can keep up on the infrastructure side, the management of a global data footprint can quickly tie up resources and lead to inefficiencies. We deliver control that scales into the exabyte range, with simple policy-driven data automation, allowing organizations to focus on extracting value from their data – instead of just managing it.

    Extract Value from Your Data

    Some would see data analytics as the sole way to create value from data, but this is a narrow definition. When we talk about Data Capital, our focus is on enabling new digital experiences and bringing efficiency to workflows, as well as the insights that can be uncovered from data analytics. It’s important to see the extent of the value that can be unlocked here. Take media and entertainment, for example: on-demand viewing has opened up new revenue streams and created strong competitive differentiation for organizations that led the charge into digital content delivery. We built a platform that can deliver not just insights, but the digital experiences themselves, to enable our customers to get the most value from their data.

    Make it Real at Dell Technologies World

    If you’re eager to begin your journey toward unlocking your Data Capital, then I hope you’ll join us at Dell Technologies World in Las Vegas this April 30 – May 3. We’ll have a series of informative breakout sessions focused on planning for the future of your business, maximizing performance from your storage solution, and optimizing for extreme scalability. You can also take the products for a test drive at the hands-on labs; learn about Data Capital in action and hear about customer successes at the solution expo booth; and have 1:1 meetings with our experts and architects throughout the show.

    If you’re unable to see us there, you can still follow the action: @DellEMC, @DellTechWorld, #DellTechWorld.

    And be sure to visit us online to learn more about how you can begin your journey.



  • [The Source Podcast] A Canadian Approach to Hybrid Cloud and EUC

    With cloud data center providers embracing Canadian soil, the adoption of hybrid cloud strategies in Canada has increased considerably. Or as Peter Near (@pnear), national director of Systems Engineering, VMware Canada, puts it, “Canadians have gone from pumping the brakes to pressing the accelerator” regarding hybrid cloud adoption.


    In this episode of Dell EMC The Source Podcast, Near explains the benefits to those Canadian customers that needed to wait for local cloud data centers, and the maturity of the tools and processes available to make the move. No longer are we talking about converting and re-platforming. We are talking about taking what we are doing today, modernizing what we are doing in our private cloud and easily integrating with public cloud, and simplifying the ability to move workloads as requirements change.

    For all my Canadian friends, be sure to follow @VMWareCanada and @DellEMC_Canada for the latest.

    The Source Podcast episode 126
    Get Dell EMC The Source app in the Apple App Store or Google Play, and subscribe to the podcast on iTunes, Stitcher Radio or Google Play.

    Dell EMC The Source Podcast is hosted by Sam Marraccini (@SamMarraccini)




  • Dell EMC President’s Circle 2017 – That’s a Wrap!

    Co-authored by Joyce Mullen, president, Global Channel, OEM & IoT, Dell EMC, and Jay Snyder, senior vice president, Global Alliances, Dell EMC

    The Dell EMC President’s Circle recognizes top partners who have delivered exceptional results over the course of a year. As we announced at Global Partner Summit 2017, our winners are rewarded with an extravagant grand prize trip for two to the luxurious Gleneagles® Hotel in Scotland.


    Spanning our entire partner ecosystem—Solution Providers, Distributors, OEM and Global Alliances—we have tracked various financial metrics over the course of the year … And the results ARE IN!

    Please join us in congratulating the following partners:

    Solution Provider and Distributor

    Ahead LLC (USA)

    Atea A/S (Denmark)

    Axians (France)

    CDI (USA)

    CDW (USA)

    Computacenter EMEA (Pan EMEA)

    Comtek (Russia)

    Data Blue (USA)

    Data Solution (Korea)

    Data#3 (Australia & New Zealand)

    Hilal Computers (Middle East)

    Ingram Micro (USA)

    Insight (USA)

    Itochu Techno-Solutions Corporation (Japan)

    ITPunkt (Central & Eastern Europe)

    LTA RH Informatica (Brazil)

    Maticmind S.p.A (Italy)

    NCS Pte Ltd (Singapore)

    Presidio (USA)

    RoundTower Technologies (USA)

    Sapta Tunas Teknologi (Indonesia)

    SHI International (USA)

    Systech Sistemas E Tecnologica (Brazil)

    Team Computers (India)

    Tech Data (USA)

    Trustnet Mexico (Mexico)

    Unitech-Rio Comercia E Servicios (Brazil)

    Worldwide Technology (United Kingdom)

    Worldwide Technology (USA)

    Zones (USA)


    Arrow Intelligent Systems (Americas)

    M2M (EMEA)


    Global Partner of the Year: HCL Technologies (India)

    Americas Partner of the Year: IBM (USA)

    EMEA Partner of the Year: Vodafone (United Kingdom)

    APJ Partner of the Year: Dimension Data (South Africa)

    SI Partner of the Year: TCS (Tata)

    SO Partner of the Year: DXC

    CSP/Telco Partner of the Year: Swisscom

    Industries Partner of the Year: Schlumberger SIS

    We want to personally thank these partners for their extraordinary efforts and commitment to the Dell EMC Partner Program over the past year. Thanks to our tight partnership, you continue to delight our customers as they continue on their digital transformation journey. With you by our side, we are unstoppable. We have so many top-notch partners that were in the running during this extremely tight race, and even more that will be formidable forces in future contests. To all of you, we say thank you.

    If your name isn’t on this list, don’t fret … You have another opportunity to claim the top spot next year. Stay tuned for those details, and make sure to register for our upcoming Global Partner Summit, April 30 – May 3 in Vegas, where we will unveil the next President’s Circle destination.

    We can’t wait to celebrate all of the winners in Scotland this June. It’s sure to be a trip of a lifetime!

    Again, congratulations!



  • Boosting Business Agility, Eliminating Bottlenecks with 25GbE

    Networks are the modern data center’s foundation. Rapid information growth and evolving applications require real-time responses. This has led businesses of all sizes to build up their corporate data centers by including a growing number of modern compute and storage platforms. These modern data centers require high-performance interconnections between all of the various server and storage devices via a high-speed information superhighway, both within the rack and between the racks.

    Unfortunately, a bottleneck exists within the data center network infrastructure that prohibits organizations from reaching optimum application performance. That bottleneck is the 10GbE I/O connection from the server/storage devices to the top-of-rack (ToR) Ethernet switch (See Figure 1).

    This problem is exacerbated when there is an abundance of virtualization and cloud-based traffic occurring in real time. The rapid growth of server virtualization introduced significantly more east-west Virtual Machine (VM) network traffic between servers and the chassis to accommodate the real-time mapping of resources and led to a migration to 10GbE connections.

    However, along with increased use of server virtualization, many organizations are also implementing multi-processor servers, all-flash storage systems and cloud-based technologies that require even higher-speed I/O communications. These transformations are described in the IDC study “Accelerate Business Agility with Faster Server Refresh Cycles,” which finds that businesses must keep infrastructures up to date in order to support the scale and complexity of a changing application landscape. So, as companies consider and plan their server refreshes as part of IT Transformation, they also need to consider upgrading their network – to 25GbE.


    Figure 1 – 10GbE In-rack connectivity

    Based on the projected rapid growth of 25GbE adoption in a 2017 Ethernet switch market update by the Dell’Oro Group, server and storage vendors are now shipping 25GbE I/O on their platforms (see Figure 2). These new 25GbE I/Os utilize more efficient chip technology and provide 2.5x the bandwidth of existing 10GbE connections. That’s like taking an existing 4-lane highway and expanding it to a 10-lane highway.

    There are also added benefits to 25GbE, including more efficient use of valuable switching ASIC I/Os (one, instead of the four used by 10GbE and 40GbE) and the ability to reuse existing 10GbE fiber optic cabling when upgrading to 25GbE. This expanded 25GbE connectivity will greatly enhance traffic flows between devices in the rack, as well as traffic up into the data center fabric and to other racks connected to the fabric.
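    To make the 2.5x figure concrete, here is a rough back-of-the-envelope sketch comparing transfer times for a hypothetical 1 TB dataset; it ignores protocol overhead and assumes the link is the only bottleneck:

    ```python
    # Rough transfer-time comparison for a 1 TB dataset over 10GbE vs 25GbE.
    # Ignores protocol overhead; assumes the link is the only bottleneck.
    dataset_bits = 1e12 * 8  # 1 TB expressed in bits

    for speed_gbps in (10, 25):
        seconds = dataset_bits / (speed_gbps * 1e9)
        print(f"{speed_gbps}GbE: {seconds:.0f} s")
    # 10GbE: 800 s, 25GbE: 320 s -- the same 2.5x ratio as the line rate
    ```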


    Figure 2 – Data Center switching speeds (Dell-Oro Group 2017)

    There are a number of switch vendors in the market, including Dell EMC, shipping 25GbE switches that let servers operate at peak performance and achieve transformational SLAs. These new switches offer ports that provide native 25GbE connections as well as multi-rate speeds including 10/25/50/100GbE. The best news is that both enhancements in 25GbE chip technology and higher levels of customer demand are helping to bring the cost of 25GbE technology to nearly the same price as existing 10GbE platforms.

    When going through a server refresh campaign within the data center, it now makes sense to take a closer look at incorporating 25GbE I/Os in those new servers. Also coordinate with the networking team to take a look at the ToR Ethernet switches within the racks and see if they can efficiently handle 25GbE switching connectivity within the rack. If not, it might be a good time to investigate the implementation of a new generation of 25GbE Ethernet switches that support IT Transformation initiatives and additional OPEX savings by using less power, requiring less cooling and taking up less space as compared to existing switches in the rack.

    When you’re making the switch over, remember that not all infrastructure is the same. It must be modernized. You don’t have to use the same vendor you’ve always used; in fact, being vendor indifferent is preferred for modern networking. Rather than just upgrading to yet another proprietary switching solution, take the time to investigate Dell EMC 25GbE open networking solutions that provide open, standards-based technology disaggregation and embrace choice, flexibility and innovation within the network infrastructure.

    For example, when adopting an open networking solution with a choice of operating system (OS) software running on the switch hardware, Linux-savvy server administrators can take advantage of a Linux-based OS running on their switches and adopt a more common interface across both server and networking resources (see Figure 3). They will also be able to leverage the large talent pool in Linux and open source, standards-based communities for data center automation and orchestration tools that help accelerate rapid innovation and organizational agility.


    Figure 3 – Empowering Linux in the data center with 25GbE Open Networking

    As applications and their respective server platforms continue to evolve, adoption of 25GbE into the data center networking infrastructure is a necessity for all organizations. Including a 25GbE upgrade to the network when going through a server refresh ensures the elimination of bottlenecks that rob modern infrastructures of their performance and agility gains. Let Dell EMC help transform your IT with modern data center 25GbE solutions.

    For additional information, please refer to the Dell EMC Networking solutions pages.



  • Dell EMC Flexes HCI Muscles with Next Generation VxRack FLEX

    The numbers don’t lie. Sixty percent of Dell EMC VxRack FLEX customers make a second purchase within 180 days. And these are big customers – Fortune 500 companies that have resources, do their research and select proven and competitive solutions after going through rigorous bids and POCs. This tells us that our rack-scale hyper-converged infrastructure (HCI) system delivers on its promise and VxRack FLEX is on an upwards trajectory with no signs of slowing down.


    Launched in 2015, VxRack FLEX offers a turnkey experience that includes delivery and support of a single solution managed holistically. Under the covers, it includes world-class compute, storage and integrated networking for both virtualized and non-virtualized environments. With VxRack FLEX’s simplicity of deployment and ongoing management, an analyst report found that VxRack FLEX offers 6x faster time-to-value and 30 percent lower TCO compared to a traditional SAN.

    VxRack FLEX enables massive scale-out capabilities for the data center along with flexible deployment options (compute and storage residing on the same server or separated out). Add nodes one by one within a single rack or scale out with additional racks as compute and storage resources are consumed. This provides your infrastructure with elastic sizing and efficient scalability, allowing you to start small with your proof of concept and grow to web-scale size as your requirements evolve. Because VxRack FLEX supports diverse environments, it is ideal for consolidating both traditional and modern applications that demand high performance, availability, and resiliency onto a single system.

    There have been a number of important enhancements to VxRack FLEX already this year. Available this month, VxRack FLEX is integrated with the latest 14th-generation Dell EMC PowerEdge servers. This next-generation system means more powerful handling of workloads, greater capacity, and improved flexibility:

    • 5x more IOPS per node
    • 60 percent more flash capacity per node
    • 4x more memory
    • 34 percent more virtual machines per node
    • 250 percent more bandwidth

    Dell EMC PowerEdge 14th-generation servers are designed specifically for HCI workloads, which depend on the capabilities of both servers and storage. This enables us to offer customers enhanced storage capacity and flexibility, allowing them to optimize their storage configurations for their hyper-converged environments. These servers enable VxRack FLEX to deliver significantly faster access to applications and data thanks to higher core counts, faster clock frequencies, more memory channels, and faster memory.

    Also new this month are enhanced monitoring, alerting and reporting capabilities – a huge priority for our customers running mission-critical applications because it allows for greater agility and control of server resources. These new features mean failures are quickly identified and Dell EMC Support is informed immediately for speedy resolutions. Proactive alerting and automated technical support means less time is spent troubleshooting so more time can be spent addressing business priorities.

    If you’ll be at Dell Technologies World this year, be sure to join our breakout sessions for a deeper dive on our architecture, use cases and workloads.

    If you can’t be there in person, check out some of our recent white papers.

    VxRack FLEX has had tremendous momentum coming out of 2017 and with these exciting new releases happening in Q1 and Q2 we expect nothing but acceleration and growth. Reach out to us at any time to discuss the best HCI approach to meet your needs!



  • The Importance of Robots, VR and IoT to Channel Partners in 2018

    Last month we celebrated one year of the new and improved Dell EMC channel partner program. And what a year it was! We learned a lot during this time, and I’m pleased to say we have listened to the feedback from our channel partners and customers and acted on it.


    We’ve kicked off 2018 on a high by announcing improvements that will continue to increase the benefits for our valued partners. With a new rebate structure and a competitive MDF strategy, we have shown our intention to always reflect on our offering and continue to make the program simple, predictable and profitable for our partners. Locally, we hosted our first Partner Advisory Board of the year; it’s a great event where we drive truly meaningful conversations that allow us to continually invest in and improve the program. But these ongoing changes are just one part of the strategy that helps our partners remain successful.

    At the end of 2017, Dell Technologies predicted 2018 to be the year that human-machine relationships reach new levels. So, what does this mean for our channel partners? Emerging technologies like artificial intelligence (AI), augmented reality (AR) and virtual reality (VR) will dominate the conversation. Just this week, an Australian school revealed it was using a robot to teach alongside a teacher. The AI capabilities of the robot provide a two-way experience that goes above and beyond a child using a mobile device. The potential for AI to disrupt all industries is here, and we are about to jump in head first. It’s important to ensure that your business is not only aware of what can be achieved using the technology but also has the technical understanding of the infrastructure changes needed to create a modern data centre.

    Advances in the Internet of Things (IoT) and cloud computing are progressing faster than we anticipated. This extra processing and analytical power is already changing the way we live with more connected homes and cars, and greater consumer expectations in almost every industry.

    One of my favourite customer stories of last year is about Tassal and our partner Intuit Technologies. Using IoT to gather more data on how their salmon pens were performing seemed like a straightforward solution. By predicting multiple variables, the team was able to produce better outcomes for the business. But to run the IoT solution, Tassal needed to upgrade its IT systems, which is where Dell EMC came into the mix. We provided the hyper-converged infrastructure required to store, manage and automate all the extra data the IoT element was producing, allowing for real-time decision-making. This journey had two parts to its success, and we encourage our partners to become experts in both.

    As we continue to see these incredible use-cases and explore new ways of working with technology, our partners need to remain ahead of the curve. Immerse yourself in the possibilities that can be achieved so, when the time comes, you can help to bring these incredible use-cases to life.

    With Dell Technologies World and our annual Global Partner Summit taking place next month in Las Vegas, we encourage all our partners, resellers and customers to join us. In an action-packed agenda, we’ll explore the latest technology trends with our experts, hold workshops and training on our full product portfolio, and share great stories from our customers. It promises to be an inspiring week with lots of insightful discussions. What are you waiting for? Find out more information and register for the event here.



  • Teraflops and Engineering Your Way to Artificial Intelligence Scale

    I often talk to customers who are on a journey to implementing Artificial Intelligence (AI) in their organizations. In my last blog, I wrote about some best practices around Deep Learning. In this blog, I am going to get more prescriptive about how you would, and probably should, explore “Scaling AI” inside your organization.

    To start, I am assuming that your organization has already developed a noteworthy model. This could be an image-classification model such as Inception, or a Natural Language Processing model; it could run on GPUs or CPUs, train with Stochastic Gradient Descent or mini-batches. You have trained your model and are ready for inference deployments. Or are you?

    Models – Day 2

    What happens when you want to train more than one model at a time? What happens if you are serving multiple inference models at the same time?

    Welcome to a few days into your production AI/Deep Learning deployment, when your environment starts hosting multiple models and Data Science teams at once.

    What is the best way for you to capture these statistics at scale? What components matter most when you Scale AI?

    Queueing Theory

    Queueing theory is probably most eloquently articulated in the postulate known as Little’s Law, which relates the queueing of arrivals to response times. In algebraic terms for infrastructures, process or file occupancy = arrival rate x latency (Little’s L = λW, with bandwidth playing the role of the arrival rate). Each process (be it on GPU or CPU) training or deploying a production model depends on how fast it can get serviced and completed. Mathematically, the latency-versus-bandwidth debate has run for decades, complicated by each storage operating system’s locks and latches, network latency, storage bandwidth, network bandwidth, and the compute parallelism that drives high arrival rates. It has been argued that bandwidth matters far more than latency for servicing requests at scale, yet high bandwidth can itself increase latency depending on the latches and locks of the underlying storage infrastructure. As such, this becomes a puzzle that has customers asking for help as they wrap their heads around Scaling AI.
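    Little’s Law itself is compact: average occupancy L equals arrival rate λ times average time in system W. A minimal sketch, with purely illustrative numbers:

    ```python
    # Little's Law: L = lambda * W
    #   arrival_rate: requests arriving per second (lambda)
    #   latency_s:    average time a request spends in the system (W)
    #   returns:      average number of in-flight requests (L, occupancy)

    def occupancy(arrival_rate: float, latency_s: float) -> float:
        """Average number of in-flight requests under Little's Law."""
        return arrival_rate * latency_s

    # e.g. a storage layer servicing 5,000 reads/s at 2 ms average latency
    # must sustain ~10 concurrent in-flight requests.
    print(occupancy(5_000, 0.002))  # → 10.0
    ```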

    Other Factors

    There have been a number of studies looking at various elements of running analytics at scale; compression and encoding algorithms for data files were the topic of one recent study of storage content at scale. For most of my lab testing, we use Protocol Buffers (ProtoBuf) with LZO compression. There are native hooks in the deep learning frameworks (e.g., TensorFlow, Caffe2) to handle ProtoBuf file serialization and streaming on read. This allows for effective use of binary encoding and CPU-optimized compression.
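    The serialize-then-compress pattern above can be sketched with only the Python standard library; note that pickle and zlib stand in for the ProtoBuf and LZO used in the actual lab setup, which require third-party bindings:

    ```python
    import pickle
    import zlib

    # Sketch of binary serialization plus CPU-cheap compression.
    # pickle stands in for ProtoBuf, zlib (level 1, fastest) for LZO.
    record = {"features": [0.1, 0.2, 0.3], "label": 1}

    blob = pickle.dumps(record)          # binary encoding of the record
    compressed = zlib.compress(blob, 1)  # fast, low-CPU compression

    # On read, the stream is decompressed and deserialized back.
    restored = pickle.loads(zlib.decompress(compressed))
    assert restored == record
    ```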

    When you are running a single training model or deploying a single model for inference, it is easy to tweak the data so the whole model fits in cached memory. What happens when the environment becomes multi-tenant? A developer or data scientist can no longer guarantee that all model data will be loaded into memory for training or inference. This pushes more concurrency and more arrivals down to the storage layer. Caches, spill space, and software-defined caches can certainly help, but at any meaningful scale a large number of concurrent requests will still reach the storage layer when servicing inference or training models.
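    The cache-pressure effect is easy to demonstrate with a toy LRU simulation; the working-set and cache sizes below are illustrative assumptions:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

# One tenant cycling over 100 records fits a 100-entry cache after warm-up.
single = LRUCache(100)
for _ in range(10):
    for i in range(100):
        single.access(("a", i))

# Two tenants cycling over disjoint 100-record sets thrash the same cache:
# every record is evicted before its next access, so nothing ever hits.
shared = LRUCache(100)
for _ in range(10):
    for i in range(100):
        shared.access(("a", i))
        shared.access(("b", i))

print(single.hits / 1000)  # 0.9 -- only the warm-up pass misses
print(shared.hits / 2000)  # 0.0 -- every access goes to storage
```

    The second tenant did not halve the hit rate; it destroyed it. That cliff is why multi-tenancy pushes arrivals to the storage layer far faster than capacity math alone would suggest.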

    Containers are all the rage. Containerization of workloads certainly makes troubleshooting and deploying models and environments easier. The complexity comes in getting durable storage into the container. Either all the data is loaded into each container, the container security model is breached to allow durable storage connections, or a container bridge is deployed that lets fully secure containers leverage external file or object systems for durable model data. Our recommended approach is a container bridge: it maintains container security, allows external shared durable storage for model and training data, and maximizes manageability and scale.
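    As one hypothetical sketch of the bridge pattern, a Kubernetes pod can mount external shared durable storage through a PersistentVolumeClaim, keeping the container image itself stateless and unprivileged; every name below is a placeholder:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-worker                # placeholder pod name
spec:
  containers:
    - name: trainer
      image: example.registry/dl-trainer:latest   # placeholder image
      volumeMounts:
        - name: model-data
          mountPath: /data             # serialized training files appear here
  volumes:
    - name: model-data
      persistentVolumeClaim:
        claimName: shared-model-data   # PVC bound to external shared durable storage
```

    The container never holds credentials to the storage backend; the claim is bound by the platform, which is what keeps the security model intact while the data stays shared and durable.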

    What does good look like?

    Let’s start to tie back some of the items raised above to drill into the specifics.

    Let’s assume you have a series of Inception and NLP models running at scale. Based upon our internal analysis running several deep learning frameworks, deep learning storage profiles as random IO (i.e., random reads). Even assuming a highly read-centric workload, training and inference have different levels of parallelism. Training can only go so embarrassingly parallel, constrained by GPU memory and how the mini-batch optimizers are organized, while inference of trained models can deploy in very embarrassingly parallel configurations. This creates different dynamics when both run at once in the same cluster; if training is isolated to one compute/storage cluster and inference is serviced by another, the environment is easier to model mathematically. Serializing your data allows manageable compression and optimized file management, though a single serialized file adds more single-file concurrency than managing a myriad of small files would. The sweet spot is a storage subsystem whose latency, per Little’s Law, does not degrade as concurrent connections grow, while still allowing the storage and management optimizations of serialization. If these overarching factors start to sound like chess vs. checkers, then you are beginning to appreciate the complexity of scaling AI.

    If you assume a compute farm worthy of embarrassingly parallel workloads, then you can start building the math model of your environment, and Little’s Law is a good place to start. How many concurrent models should you expect, and of what types (training vs. inference)? What neural network architectures (e.g., CNNs, RNNs), on what sized images, text bits, or serialized files, in what configurations? What turn time do you expect with regard to wall-clock model completion? These factors can then be fed into models of network capability (10G vs. 40G, for example) and of storage chassis concurrency and bandwidth (say, 1M concurrent reads per second at 15 GB/sec per 4U storage chassis).
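    Those inputs reduce to back-of-envelope arithmetic. A sketch, assuming the 15 GB/sec-per-chassis figure above; the dataset size and epoch time are hypothetical:

```python
import math

def chassis_needed(total_dataset_gb: float,
                   target_epoch_seconds: float,
                   gb_per_sec_per_chassis: float = 15.0) -> int:
    """Back-of-envelope: how many 4U storage chassis sustain one full read
    of the training data per epoch within the target wall-clock time.

    The 15 GB/sec-per-chassis default comes from the figure in the text;
    everything else is an illustrative assumption, not a measured result.
    """
    required_bandwidth = total_dataset_gb / target_epoch_seconds  # GB/sec
    return math.ceil(required_bandwidth / gb_per_sec_per_chassis)

# e.g. a hypothetical 54 TB serialized dataset read once per one-hour epoch:
print(chassis_needed(54_000, 3600))  # 54000/3600 = 15 GB/sec -> 1 chassis
```

    Halve the target epoch time or double the dataset and the answer doubles, which is exactly the kind of sensitivity a simulated math model makes visible before you buy hardware.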


    The net of this effort is an engineering math model that can be simulated to understand the capabilities of the scaled AI environment. That is a far cry from writing your first trained model and wondering how it will deploy at scale to serve your production customers. Questions you will want to answer as you build out your scaled AI environment:

    • What level of concurrency can my storage support before latency ticks up?
    • What is my container bridge strategy?
    • What level of performance am I expecting and/or getting in my PoCs?
    • What is the scale of the durable storage for training vs inference?
    • How comfortable am I doing this alone, or do I want to talk to someone who does this for a living?
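    The first question in the list can be probed empirically with a small harness that replays reads at increasing concurrency and watches for the latency tick-up; `read_fn` is a placeholder to swap for a real random-record read against your storage:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def latency_vs_concurrency(read_fn, concurrency_levels, requests_per_level=64):
    """Time read_fn under increasing thread counts; return mean latency per level.

    read_fn is any zero-argument callable performing one storage read; in a
    real test it would fetch a random record from the training data store.
    """
    results = {}
    for workers in concurrency_levels:
        latencies = []

        def timed_read():
            start = time.perf_counter()
            read_fn()
            latencies.append(time.perf_counter() - start)

        with ThreadPoolExecutor(max_workers=workers) as pool:
            for _ in range(requests_per_level):
                pool.submit(timed_read)
        results[workers] = statistics.mean(latencies)
    return results

# Stand-in "read" (a 1 ms sleep) just to exercise the harness:
profile = latency_vs_concurrency(lambda: time.sleep(0.001), [1, 8, 32])
```

    Plot the resulting means against concurrency and the knee of the curve is your answer: the point where Little’s Law says added arrivals are buying queue depth, not throughput.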



  • How Converged Systems Enable IT Transformation

    Building Agility, Efficiency, and Simplicity into Infrastructure Is Fastest Way to Transform IT

    Just about every organization wishes they could make IT more agile. But a new survey of over 4,000 IT leaders conducted by Enterprise Strategy Group (ESG) indicates most organizations are finding the sheer weight of their disparate legacy systems slows their progress towards achieving that goal.

    Dell EMC VxBlock rack

    As we know, and as the survey makes clear, most organizations encounter tough challenges when modernizing their IT environments. Only six percent of the organizations surveyed have managed to fully align IT and the business, just a one percent increase over the same ESG survey a year ago.

    Organizations are making some progress, just not at a rate many of them are likely to find satisfactory. The ESG survey asks respondents to rank their transformation from Legacy (Stage 1) to Transformed (Stage 4), with Emerging (Stage 2) and Evolving (Stage 3) in between. Year over year, 88 percent of respondents rate themselves at either Stage 2 or Stage 3. While at first glance that may seem like substantial progress, it represents only a collective five percent increase year over year: Emerging is up four percent, while Evolving is up only one percent. Clearly, there’s a major hurdle as organizations try to move beyond Stage 2, which ESG describes as the early stages of adopting modern IT processes.

    Transforming IT is one of those classic chicken-and-egg problems. It’s challenging to modernize legacy systems that were never designed to be automated. The good news is that modern converged and hyper-converged infrastructure embeds the automation required to make IT environments agile enough to meet the dynamic demands of a truly digital business.

    bar chart: transformed organizations have completed more than 2x their recent IT projects under budget

    At Dell EMC, we’ve identified five foundational pillars of organization that form our converged and hyper-converged philosophy and yield the best outcomes for businesses on their IT transformation journey:

    Engineered as one: Pre-engineered and validated from the ground up.

    Manufactured as one: Dell EMC converged and hyper-converged systems are factory and logically built with exacting quality.

    Managed as one: Centralized management options for getting the most out of your infrastructure resources.

    Sustained as one: Backed by holistic lifecycle management, derisking updates and providing the utmost in stability.

    Supported as one: One-call support to save time, money, and resources.

    One of the best examples of those five pillars being implemented in a turnkey fashion is the Dell EMC VxBlock System 1000. Converged systems such as VxBlock 1000 enable businesses to flexibly offer the right mix of resources and data services, increase data center efficiency, and deliver a turnkey engineered system experience with lifecycle assurance. All these benefits transform IT and free the organization to focus on innovation.

    “Since implementing Dell EMC VxBlock and VxRail, we spend four to six hours per week managing systems, not up to three days,” says Corey Aponte, server administrator at Johnsonville Sausage. “We can now focus more on how IT can provide value for the business, for example, helping engineers do more with operational technology or introducing automation.”


    Complementing the VxBlock System in the Dell EMC CI/HCI portfolio are Dell EMC hyper-converged appliances such as VxRail and XC Series, and hyper-converged systems like VxRack FLEX and VxRack SDDC. Each provides a transformational resource to simplify the complex status quo of IT, enabling organizations to become more agile and efficient.

    To learn more about how Dell EMC converged and hyper-converged infrastructure can help transform IT for your business, visit

    If you’re attending Dell Technologies World in Las Vegas this May, be sure to visit the Converged and Hyper-Converged product sections where you can take a look at the industry’s leading portfolio firsthand, play a CI trivia game, and speak with our CI/HCI experts.

    Read the ESG report here.


