Julia Open Source Development

About

In this episode of Open Source Directions we were joined by Jeff Bezanson and Katie Hyatt, who talk about the work they have been doing with Julia. Julia is a programming language that was designed from the beginning for high performance. Its programs compile to native code for multiple platforms via LLVM. Julia is dynamically typed, feels like a scripting language, and has good support for interactive use.

Julia has a rich language of descriptive datatypes, and type declarations can be used to clarify and solidify programs. This language uses multiple dispatch as a paradigm, making it easy to express many object-oriented and functional programming patterns. It provides asynchronous I/O, debugging, logging, profiling, a package manager, and more.
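For example, multiple dispatch means a method is selected by the types of *all* of its arguments, not just the first. A minimal sketch (the `Shape` types here are hypothetical, purely for illustration):

```julia
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Rect <: Shape
    w::Float64
    h::Float64
end

# Dispatch on one argument: the method is chosen by the concrete type.
area(c::Circle) = π * c.r^2
area(r::Rect)   = r.w * r.h

# Multiple dispatch proper: the method is chosen by both argument types,
# with a generic fallback for any pair of Shapes.
overlap_test(a::Circle, b::Circle) = "circle/circle special case"
overlap_test(a::Shape,  b::Shape)  = "generic fallback"

area(Circle(1.0))                          # π
overlap_test(Circle(1.0), Circle(2.0))     # "circle/circle special case"
overlap_test(Circle(1.0), Rect(1.0, 2.0))  # "generic fallback"
```

Because dispatch considers every argument, special cases like circle/circle can be added without touching the generic code, which is what makes many object-oriented and functional patterns easy to express.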

Transcript

[Music] Hello, the internet! Welcome to Open Source Directions, hosted by OpenTeams, the business-to-business marketplace for all of your open source services and support needs. Open Source Directions is the webinar that brings you all the news about your favorite open source projects. My name is Henry Badry, I'm the growth marketer here at OpenTeams, and I'll be your host for this session. Joining me as the co-host is Tony.
Oh, he's muted. I've unmuted you. Sorry, Tony, wait.
Hi Tony, we've got you on mute and I can't unmute you. Oh no. Okay, Jeff, how about you introduce yourself? Go ahead. Oh, there we go, Tony's back.
Oh my god. Tony, are you there?
I am. I'm going to restart my computer and come back.
Okay, no worries. Jeff, do you want to introduce yourself, and then we'll keep going?
Yeah, sure. Hi everyone, I'm Jeff Bezanson. I work on the Julia programming language, and I'm here in Cambridge, Massachusetts right now.
All right, and then we've got Katie.
Hi everyone, I'm Katie Hyatt. I've been a Julia contributor since 2015. I've also worked on a bunch of Julia packages, and I'm based in New York City, so apologies in advance for the ambulance sirens.
Well, thank you very much for introducing yourselves. Rather than wait for Tony, we'll keep going and he'll join us later. Hi Mcgill, hi Mohammed. Before we get into the juicy details of Julia, an amazing programming language, we're going to look at a few pull requests that both Katie and Jeff have been enjoying this week. So what pull requests have you found interesting? Jeff, what about you first?
Yeah, so the new Apple hardware is upcoming, and it's something people are excited about, and a few people in the Julia world got on top of it right away, which I thought was pretty cool. The response time there was really good. A couple of people ordered the developer kits and have already started working on bringing everything up, and I think it basically works already, but there are a few details, a few rough edges. I linked to a pull request there. Should I click it, or do you want to click it?
Well, we've just shared it in the chat, thanks Brandon, so if anyone wants to have a look at it, it's in the chat.
Okay, good. Yeah, so that's just one of the PRs for getting Julia working on the new Apple ARM hardware.
Awesome. And Katie?
Yeah, so mine is pretty similar. NVIDIA just released a new CUDA version, and one of the packages I work on a lot is CUDA.jl. The creator of that package, Tim Besard, has a really nice system set up where it automatically goes into the headers and creates wrappers, but then you have to make sure that it actually did that correctly, which can be a problem if, for example, the name of an enum changes. A bunch of the tests broke, and a bunch of our existing wrapper code was wrong because the order of some arguments changed as well. So it was a matter of digging in, figuring out what was not working, and then getting things to work again. The turnaround time was pretty quick, considering CUDA 11 only became public a week or so ago.
Okay, cool, awesome. Thank you very much for sharing. We're now going to get into the introduction section, where we give everyone a little bit of context to make sure you're up to speed on what Julia is. Julia is a new programming language that combines the productivity of languages like Python or MATLAB with the performance of C and Fortran. Julia provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive library of fast mathematical functions. It is being used by a number of universities around the world for both teaching and research, and there is a diverse range of businesses using it, in fields including engineering, finance, and e-commerce, just to name a few. One incredible thing is the number of users: 10 million. That is just incredible.
Also, that's the number of downloads; I wouldn't say it's the number of users. That's kind of the only counter we have. There's no way to know the number of users, but we know the download counter.
Okay, awesome. So that's based off julialang.org and Docker Hub. And Tony has rejoined us. Is everything working okay, Tony? I hope so. Do you want to quickly give yours?
Hi, I'm Tony Fast. I'm a data scientist and open source consultant at Quansight. I'm based out of Atlanta, and yeah, I'm excited to be doing this Julia interview today.
Awesome, very nice. So Julia's been around about eight years at this point; I was just wondering how long we can still say it's a new language.
Yeah, I guess it's not that new anymore. I like being new; it's good to be new, it's exciting. But yeah, I'm wondering how long we can keep that up myself.
I mean, is it like dog years, where each one is like seven years?
Yeah, I mean, it's much newer than most languages people use, so it's true. But yeah.
Awesome. Okay, so why was this project started? What need does it fill?
I guess I can go first a little bit. I did a PhD in computational physics, and I've been doing scientific computing for a long time. In that community, it's really common to have people do their prototyping and data visualization in Python, and then do their actual production code writing in C or C++, or sometimes Fortran. C++ and Fortran are great languages, and so is Python; I don't want to make it sound like I think any of them is bad. But the situation where you have to know two different languages with pretty different internal structures is not necessarily great if what you want to do is write effective scientific code. First, you have to keep track of two code bases. And then, if you want to communicate your code to other people, whether that's students you're trying to train in your group or other scientists who maybe aren't as experienced with programming, it's really difficult, because although, for example, C++ is a very powerful language, it's probably not the easiest thing to learn.
The other thing that's not great in the scientific programming community is that we don't really have any systematized way of imparting software engineering knowledge. So it's really common to find scientific code with no unit tests and no docs, and if there are any kind of tests, they're often years out of date.
The situation is really not very good. For me, the reason I switched to Julia was that I was able to write both my prototyping and production code in one language. I didn't really have to modify it too much to be able to scale up to many distributed nodes, or to use the GPU, for example. And Julia promotes writing tests and docs from the get-go. Test support is built into the standard library, so you don't have to install a bunch of extra packages to write unit tests, and similarly it's really easy to measure code coverage. Julia also has really good first-class support for both single-node threaded parallelism and distributed parallelism, and for linear algebra, which was really important for me.
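The built-in test support Katie mentions is the `Test` standard library; a minimal sketch (the `clamp01` function is hypothetical, just something to test):

```julia
using Test  # ships with Julia; nothing to install

# A toy function under test, defined only for this example.
clamp01(x) = clamp(x, 0, 1)

# @testset groups related @test assertions and reports a summary.
@testset "clamp01" begin
    @test clamp01(-0.5) == 0
    @test clamp01(0.25) == 0.25
    @test clamp01(2.0)  == 1
end
```

Running a package's tests is then just `Pkg.test()`, and coverage can be collected by starting Julia with `--code-coverage`.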
Awesome.
Yeah, that definitely describes the motivation; that's just the kind of thing we were thinking about when we set out to solve this kind of problem. What it comes down to at the programming language level is a tension between flexibility and efficiency, and that's something that's very difficult to get. In fact, it's still an ongoing research topic in programming languages; no one really has the perfect solution to it yet. But we wanted to improve the combination of those that you could get, at least for some cases.
One example I like is putting physical units in a program: being able to label things as meters and seconds, say. It's something that's come up a lot over time because it's kind of an obvious thing to do, and there have been famous bugs that happened because people got confused about units in their programs. So people wanted to do that, and there have been many ideas about how to put it into languages, but it never really happened, right? How many programs have any of us seen in the real world that actually have units on things? It never seemed to take off. But in Julia it actually works really well. There's a units library, and it just composes with everything else very easily: you can make a matrix of unit quantities and do linear algebra with it, and it just works right away, with very little overhead; sometimes there's no overhead at all. People have plugged units into image libraries, and you can put them into differential equation solvers, for example, and everything just goes through and works. I don't know how many people use it, but it definitely works for real this time.
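The units library Jeff alludes to is most likely Unitful.jl; a small sketch of the kind of composition he describes (assuming Unitful is installed):

```julia
using Unitful        # the u"..." string macro attaches units to numbers
using LinearAlgebra

v = 3.0u"m" / 1.5u"s"   # 2.0 m s^-1; units propagate through arithmetic
d = v * 10u"s"          # 20.0 m; seconds cancel automatically

# A matrix of unit-tagged quantities; generic linear algebra just works.
A = [1.0u"m" 2.0u"m";
     3.0u"m" 4.0u"m"]
x = [2.0, 1.0]          # dimensionless coefficients
b = A * x               # every entry of b carries meters

# Mixing incompatible units raises a DimensionError, which is the point:
# 1u"m" + 1u"s"  # error
```

Because the unit check lives in the type, it is resolved at compile time for concrete types, which is why the overhead can be zero.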
That's super awesome. I guess what you're saying is that my comments about the units of my variables don't count.
I'll take comments; I'll never say no to a comment.
Well, thank you. I was hoping you all could explain the history behind the name.
Yeah, everyone wants to know where the name is from, understandably. There's not really a good answer; it just kind of sounds nice. It has some nice connotations, like the Julia set and Gaston Julia. We also like to make up stories: every time someone asks, Stefan tries to think of a different story. He once told someone it's his middle name. I also like the Julia Child connection; she was also cooking things up in Cambridge. So there's not a really good answer to the name.
The logo evolved out of the ASCII-art text banner in the REPL. That's actually one of the first things we did: I knew we needed an interactive prompt, so we just made the REPL first. You'd type something in and it would just type it back to you, because there was no language yet, but that was the first thing we wrote. So there was this ASCII banner that I embellished a little bit. I can't really do very much with that; I'm not a visual artist at all, so I just added some more dots and some more colors, and that's about all I could handle. And Stefan said, that doesn't look too bad, but we need a proper graphical version of this; it can't just be ASCII. So he made an SVG version, where he picked a typeface and all of that, and turned it into more of a logo. So it kind of evolved out of very unskillful ASCII art, filtered through Stefan making an SVG.
I love that story. It seems to be quite a common thing, people just doing it themselves, and I love it; there are some crazy, funky logos out there. But on a different note, what are some of the alternative projects that are out there?
Yeah, so in science, like I said earlier, lots of people are using C++, Fortran, and Python libraries. MATLAB is really common too, especially in experimental science, because it has really good libraries for interfacing with your measurement devices, or for doing PDE simulations, for example. Astropy is a big suite of really great software that, as the name implies, astronomers use to run a lot of different simulations and to analyze astronomical observation data. Julia is also being used a lot for machine learning, actually, and I'm sure everybody listening to this is aware of the existence of TensorFlow and Keras. So yeah, those too.
Following up there, what technology is Julia built on?
A really big one is LLVM, the Low-Level Virtual Machine, which is a common compiler infrastructure. The idea is that they implement back ends for every hardware architecture, languages all target LLVM, and then you get all those architectures for free, plus a bunch of reusable optimizations. It started at the University of Illinois, I think not too long before we started working on Julia; around the same time, actually, maybe a couple of years before. So it was an exciting up-and-coming thing, and well, it still is, I guess. It's used in a lot of Apple's toolchain, so it's used for Apple's default compilers, and it's used in the Swift compiler, so it's increasingly widely used. We looked around for something to use for native code generation, and we wanted to be able to compile in memory because we knew we needed a just-in-time compiler, and LLVM just seemed to be by far the best thing. It has a really thorough API, and it has lots and lots of optimizations implemented. If you want something like that to target as a compiler, it's really far and away the best option, and it's generally really, really good; it's a great project. So that was a key strategic thing to pick as a basis technology. We also use libuv, which is the async I/O library that Node.js uses, so that's another piece we have in there.
Okay, awesome. It's a fantastic project, and it's so popular; 10 million downloads is incredible, and the fact that Apple uses LLVM across a lot of its tooling, that again is just incredible.
So, back to the origins: who actually started the project?
It was started by myself, Stefan Karpinski, Viral Shah, and Alan Edelman. I guess what happened was that Viral was the person who initially knew all of us, while we didn't know each other. He had separately talked with us; I had, and Stefan had, at different times, told him about ideas we had for programming languages. At some point he said, wait a minute, you guys say a lot of the same things, you should talk to each other. So he introduced all of us, and it just clicked right away. We started this furious email thread that just went on and on, and we decided we wanted to do something like this. At that time Viral and I were working at a startup company that was acquired, so we were at a transition point and wanted something else to do. We thought about how we could really work on this and make it happen, and Alan Edelman was able to hire me at MIT to work on it full-time so it could get off the ground. That's how it started: Stefan and Viral contributed remotely, and I worked on it at MIT.
That is a super cool origin story. That leads me to ask: who's maintaining the project? Are all the original people who created it still maintaining it, and how is maintenance going?
Yeah, everyone is still working on it, all of us, very much. The number of contributors to the Julia language core repo on GitHub itself just recently crossed a thousand.
Wow.
So there are a lot of contributors now, and we have people from all over the world too, and not just academic people: people who are in industry, people who are students, people who are just interested in the language for whatever reason.
It's kind of cool to log on and see that lots of commits were made at 3 a.m. my time, and know it's not just people with a weird sleep schedule. We also have people coming from many different languages, in the sense of human languages, not programming languages; there are many different international communities that maintainers come from, which is really great. And we now have a bunch of people who are employed by Julia Computing itself, which of course is also awesome, because they can spend all their time working on Julia and making the tooling around it better.
That's awesome, that's fantastic. I know you briefly touched on it, but it's a really important point and something I'm really curious about: diving a bit deeper into what kind of communities both the users and the contributors come from.
At least from my perspective, there's a lot of variety. One thing that's really cool is that communities that wouldn't necessarily talk to each other are interacting because of Julia. For example, very early on after I started, late 2015 or early 2016, I remember talking to somebody who's a neuroscientist, and they said: this is really weird, I have this package, and all of a sudden these radio astronomers are making all these pull requests against my package. I don't really understand why, but it's cool; I don't know anything about radio astronomy. It turned out that there was some kind of signal-filtering algorithm that both of them used, and neither group had any idea the other one was using it. So this guy ended up talking to all these radio astronomers, and his work ended up benefiting them, because they said: okay, we're just going to make this signal processing package better and use it to look at stars. It's pretty sweet, because there's lots of that kind of cross-pollination, also between, for example, people working in finance and people working in applied math to simulate bridges or something. So there are people coming from many different fields of science, engineering, and industry. One thing I really like about the Julia community is that people are really smart, but they're also willing not to be the smartest person in the room; people are willing to learn from each other, which I think is really important for a healthy open source community.
That cross-pollination is super inspiring. I feel like in the hallowed halls of academia those people wouldn't cross paths very often.
I think we've noticed a really noticeable effect from keeping everything in the same high-level language, and also from having a lot of reusability and composability in the language. It really has a social effect in the way it gets people working together and on the same page. That's been one of the most gratifying aspects of the whole project.
So it seems like your software can affect a lot of different people in different languages, whether human languages or programming languages. Are there any diversity and inclusion efforts that help you bolster these kinds of contributions and community?
Yeah, definitely. Katie can probably say more about this, but I know we've done lots of things: on the non-profit side we have often employed a community manager who's in charge of those kinds of efforts, we do a lot of outreach, we connect with campus groups and do workshops, and there's a diversity committee for JuliaCon that does a big part of the work. I think Katie can say something about that.
Yeah, we do have a diversity committee dedicated to JuliaCon. Another thing that I think is really important to mention is that when talks are reviewed for JuliaCon, it's done in a double-blind way, so you don't see the name attached to the talk. This has been shown to be really important for having diverse speakers at a conference. One thing we also try to do is encourage people who wouldn't necessarily feel that they're an expert on something to give a talk; this is also a really important way to encourage a more diverse speaker lineup. Of course, there are many people who are experts in something and would give a great talk, but it is often the case that somebody who would be a really great speaker doesn't feel able to give a good talk because they don't consider themselves an expert.
And of course, JuliaCon only happens once a year, and we're still around for the other 360-something days of the year. So we do have channels for people to discuss diversity issues in the community, and there's an active effort during the rest of the year to consider how our actions might affect diversity, inclusion, and equity in the Julia community. It's definitely something we're continuing to work on; I don't think we want to rest on our laurels, such as they are right now. It's definitely a process, so hopefully we'll be able to continue improving in the future.
That's awesome.
And that's great. With a thousand contributors and growing, it's good to see you've got a structured approach to diversity and inclusion, because it is a prevalent problem in the world, and particularly in open source. I just want to quickly remind everyone: please ask any questions in the chat. There's a little questions tab, and we'll be getting to those questions later on, so just go to the right, pick out the questions tab, and ask away. But now we're going to shift gears into what we call the project demo section, where we'll be seeing some of the cool features of Julia and how it works. While Jeff and Katie are getting set up, we'd like to take this opportunity to thank our sponsor, Quansight, for sponsoring this episode of Open Source Directions. Quansight: creating value from data. So when you're ready, Jeff and Katie, take it away.
So, I can do a little demo of some multithreading stuff, I think.
Okay, all right, let me do a screen share. Let's see, how do I do it? Just go to actions and then enable screen sharing. There we go.
Here we go. Okay, so there is my Juno IDE.
Is that the ASCII art?
Yes, that's the banner down there. That predated the logo, actually: the banner came first, and the logo is based on it.
Lovely.
So I have here a standard Mandelbrot set generator, but a multithreaded version. This is an example everyone does, so it's a little bit old hat, but it's actually kind of cool in the context of multithreading. Here's the function that just does a double loop to fill in an image with all of the escape-time values, and what I've added, on the inner loop that goes over the columns, over the y values, is an `@spawn`. That just means: take this loop that's going to compute one column, and spawn it to run on some thread at some point. Then outside of that I say `@sync`, which means: wait for all of those to finish. Then we can return the result. This is the new kind of spawn/wait parallelism that's similar to the Cilk programming model, if anyone's heard of that, and also a little bit similar to goroutines in Go. It's a really nice, composable, elegant way to do multithreading, and it works really well in this case, because the thing about the Mandelbrot set is that it's very unbalanced: each point can take a very different amount of time to compute, so the work is very uneven. One thing that doesn't work very well is just dividing the image up into blocks and computing those in parallel, because then the work can be very unbalanced. So here I'm doing it very fine-grained: each column is a separate parallel task. Right now in the settings I have one thread, so this is going to be the single-threaded version, but I can just time it: it took about two seconds, and maybe I'll try it again.
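The pattern Jeff describes (spawn one task per column inside a `@sync` block) might look roughly like this sketch; the function names and the exact coordinate window are assumptions, not Jeff's actual code:

```julia
using Base.Threads: @spawn

# Escape-time iteration for one point c of the complex plane.
function escapetime(c; maxiter = 80)
    z = zero(c)
    for i in 1:maxiter
        z = z^2 + c
        abs2(z) > 4 && return i   # escaped: return the iteration count
    end
    return maxiter                # assumed bounded
end

function mandel_threaded(; width = 400, height = 300)
    img = Matrix{Int}(undef, height, width)
    # @sync waits for every task spawned inside its block;
    # each column becomes its own task, so the scheduler balances
    # the very uneven per-point work automatically.
    @sync for j in 1:width
        @spawn begin
            x = -2.5 + 3.5 * (j - 1) / (width - 1)
            for i in 1:height
                y = -1.25 + 2.5 * (i - 1) / (height - 1)
                img[i, j] = escapetime(complex(x, y))
            end
        end
    end
    return img
end
```

Started with `julia --threads 4`, the tasks spread over four OS threads; with one thread the same code simply runs serially, which is the composability Jeff is pointing at.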
Around two seconds. And, not that it's directly relevant, but we can also look at the result. The way I did that was I just did some scaling, and then I have this `Gray` function from the Images package that basically converts a number into a gray tone. So this is just an array of gray values. Actually, if we open it up and look at it, it shows how images work in Julia, which I think is really cool: it's actually just an array of gray values. It's a normal 2D array, but each element is basically a little object that represents a gray pixel. Of course there are lots of other color types: there's an RGB one, there's RGB with alpha, and there are tons of other color-space types. So we know this is basically an array of pixels, not just an array of numbers, so that when you try to display it, you get an image.
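That pixels-are-just-values idea can be seen directly with the Colors package (which Images builds on); a small sketch, assuming Colors.jl is installed:

```julia
using Colors  # provides Gray, RGB, and the gray() accessor

A   = [0.25 0.5;
       0.75 1.0]      # plain Float64 values
img = Gray.(A)        # broadcast: now a 2D array of Gray pixels

eltype(img)           # Gray{Float64}: an array of pixel objects
gray(img[2, 1])       # 0.75: the underlying number is unchanged
```

Because `img` is an ordinary array whose element type happens to be a pixel, display machinery can render it as an image while all normal array operations keep working on it.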
But now let me try to increase the number of threads. I have four cores on here, so I'll put that up to four. The thing that always happens is that this is hard to demo on a video call, because the call itself uses a lot of compute power, so not all four cores are going to be completely available, but I'm game to try it and see what happens. Right now you actually have to restart Julia to change the number of threads; I think we might improve that in the future, but for now it's set at startup, so I do have to restart to get the different number of threads.
I hope it's faster.
All right, so that took one second. So for four cores, with a video call going, I get a 2x speedup, which is not bad. If you're willing to believe me, I tried this without the video running, and it went from 1.6 seconds to 0.4 seconds, so it was pretty much a 4x speedup. With the video I only get 2x, but there is a speedup there.
That's awesome. So you can do totally general multithreading in Julia, and this is a language feature?
Yeah, this works with the whole language; it works with everything. It's integrated with I/O as well, so if you have to wait for an I/O event in one of these tasks, it will get descheduled, and everything just works.
Are the color types also language features?
No, those are all in external libraries. It does not have to be built in.
Awesome, thank you very much for sharing that. Katie, do you want to share anything, or shall we move on?
I have a short GPU demo I can do, if people are interested.
Yeah, sounds great.
You're all going to watch me live-code, which is always a little exciting.
okay the first trick is to find where my window subsystem for linux is
there it is hello whoa all right step one complete um
yeah so i’m on a system that has a bunch of gpus um so i’m just gonna launch julia
um i’m gonna activate the cuda directory um so what this is gonna do is just say
i wanna use the specific environment i’m in it’s a little bit like python virtual and um don’t take that analogy too far um
and what we’re gonna do is use cuda and a package i really like called unicode plots which is what it sounds
like we’re gonna make some actually not that exciting plots in the terminal um so we’re gonna wait a second
while this um while some code is loaded
and basically the plan is that we’re going to make a random matrix on the gpu we’re going
to find its singular values and we’re going to plot those singular values so if i want to make a random matrix a i
can just say i want cuda.ran and this is going to
actually dispatch through the nvidia kurand library which is their gpu-enabled random number generator
uh let’s say i want to make this like 5000
typing is hard and then the semicolon here just means it won’t try and print it
um wait a second while this generates um the first time this gets round it’s a
little slow but it should be quicker i’m going to make a second matrix that’s going to be random normal this we did
rather than random um i’m gonna give it the inspired name b rather than a um
that’s also surprisingly so annoying um got the video running too very oh yeah
that’s true yeah yeah that might be why um so then what we’re going to do is we’re
going to compute the singular values of these and what’s really nice about julia is um if you’ve ever written a cuda c code you
know that it’s really powerful but it’s also extremely reverse uh so the fact that we were able to
write the wrappers in julia means that we’re able to um just directly call uh
julia’s linear algebra methods and have them dispatched through to the gpu because juliet run time can detect oh
this is a coup array so it lives on the cuda device and i need to call the cuda svd values function um which is
just like their implementation of what laypax gesvd so if i call svdsa is equal to
spd val actually the first thing i’m going to do is remember to turn off allowing scalar
indexing which can make pretty code really slow um and then we’re going to do svd vowels
a is equal to 2d vowels
oh right forgot to use linear algebra
this is going to run
and then i will make speedy vowels be um so i can just call this and it
immediately knows like to do everything on the gpu there’s no memory transfer between host device right now this is done now i want to
block these guys um so i can do that using unicode plots um so i can say
plot is equal to scatter plot um there’s going to be 5 000 singular values because these guys are 5
000 by 5 000 arrays um i collect the singular values to move them back to the cpu
and i give this series a name and it will be singular
okay and there’s a plot um it’s kind of an
ugly looking plot actually and the reason for that is that i’m trying to do this in windows subsystem for linux which is kind of sketchy
unicode support so i’ll just say please only use um ascii
characters this should look a little bit better yeah look at that this isn’t that exciting there’s only one dominant singular value which is not
surprising for this random matrix and then we can put another plot on top
again one to five thousand um moving back to the cpu memory svg values
of b the name is going to be singular values
b and this this bang character here means
i’m going to be modifying the plt object i generated earlier
um so if i plot this you can now see there’s more singular values um and they’re even labeled and this is
all in unicode and the terminal so this is actually very useful for me um if i’m working remotely it’ll give an
ssh into another cluster or something which may or may not actually even have x installed
uh then i can make some simple exploratory plots in my terminal um
just in ascii and unicode and i’m able to just write some gpu functions without
having to worry about passing pointers around which is pretty nice
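as a rough CPU-only sketch of what Katie’s demo did (the matrix size is shrunk and the variable names are illustrative, not a verbatim copy of her session):

```julia
using LinearAlgebra   # provides svdvals

A = rand(100, 100)          # the demo used 5000x5000 CuArrays on the GPU
svdvals_a = svdvals(A)      # singular values, sorted largest to smallest

# with the CUDA.jl package loaded, the same call dispatches to the GPU:
#   using CUDA
#   CUDA.allowscalar(false)      # forbid slow scalar indexing of GPU arrays
#   dA = CUDA.rand(5000, 5000)
#   svdvals_a = svdvals(dA)      # runs the cuda svd on the device

# and the terminal plot (requires the UnicodePlots package):
#   using UnicodePlots
#   plt = scatterplot(1:length(svdvals_a), collect(svdvals_a),
#                     name = "singular values a")
```

the point of the sketch is the one Katie makes: the call site is identical on cpu and gpu, and dispatch on the array type picks the implementation.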
i yeah so melissa in the chat says that um she likes reading the bang as screaming the command it’s like
very exciting that does make your day more fun so that
that bang character is not required it’s just usually a signal to the programmer that like hey this
function is modifying be careful so plot
got a bang yeah actually scatterplot and then if you get like one of the interrobangs
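the bang convention Katie is describing is just a naming signal, not a language rule, and a minimal sketch makes it concrete (the names here are illustrative, not from her demo):

```julia
v = [3, 1, 2]

w = sort(v)    # no bang: returns a new sorted vector, v is untouched
sort!(v)       # trailing bang: mutates v in place, as the ! warns you

# both v and w are now [1, 2, 3]
```

the same convention is why the plot-modifying call in the demo carries a bang: it mutates the existing plot object instead of building a new one.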
uh who knows what that does yeah so we have we have these various gpu functions but
also the gpu stuff in julia can actually generate gpu code so you can
write like normal looking julia code that loops over values and stuff and it can compile that to gpu code
so we have both the both the like prefab you know array libraries as well as you
know compiling custom code to the gpu awesome yeah thank you very much for
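a hedged sketch of what compiling custom julia code to the gpu looks like with the CUDA.jl package; the kernel below is illustrative and needs an nvidia gpu to actually run:

```julia
# Sketch only: requires the CUDA.jl package and a CUDA-capable GPU.
using CUDA

# An ordinary-looking Julia function, compiled to a GPU kernel by @cuda.
function saxpy_kernel!(y, a, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return nothing
end

x = CUDA.rand(Float32, 1024)
y = CUDA.rand(Float32, 1024)
@cuda threads=256 blocks=4 saxpy_kernel!(y, 2f0, x)

# Broadcast syntax also compiles to a single fused GPU kernel:
y .= y .+ 2f0 .* x
```

both styles are jeff’s point: you write normal julia and the compiler targets the gpu, rather than hand-writing cuda c.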
sharing that it was great to um to see that and now we’re going to actually jump onto the roadmap section
where we’re going to be discussing um it’s a fun little segment of this webinar and we’ll talk about where
julia is heading um what kind of future directions it’s taking so
jeff and katie can you tell us a little bit more about yeah what directions julia is taking
yeah so uh if you you might if you were looking really closely you’d see in katie’s terminal it said version 1.5 rc
one uh so that’s release candidate one and uh we’re going to be through that process very soon so version 1.5 is going to be
released soon uh it has a lot of stuff and there’s going to be a blog post detailing it but there’s really a lot of
improvements um so one thing that’s exciting
is we’re moving to a new package and artifact server uh so we’ve been moving towards
making kind of julia environments what we call totally reproducible so like right now you can send someone a
manifest file and as long as you have that file you can just exactly reproduce their environment with
the exact same versions of everything and we’re also now extending that to data and binary artifacts
uh in a persistent server so it has has the property that anything you downloaded once you’ll be
able to download again in the future so nothing ever changes or goes away uh behind your back so anything you did
before you can do again somewhere else so it’s very cool and we have this binary dependency system that sort of
just works like magic we can just serve out binaries you know for any platform we support and they’re all pre-built
and it’s yeah it’s really neat and so we’re more of that is coming online uh in 1.5 um
uh as well as a whole bunch of performance improvements uh one thing you might have noticed uh during the demos is with the
the first time you do something in julia there’s kind of like a little delay uh pause where it’s compiling things so
it compiles everything on demand so these kind of compilation pauses the first time that’s very annoying and
sometimes they can be pretty long sometimes it’s just a second or two sometimes it’s like 10 seconds which is really bad um
so we’ve been working on that a lot actually and 1.5 is a lot better than 1.4 in that regard and actually i can tell
you 1.6 is going to be even better than 1.5 because we’ve been focusing on that a lot um
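the first-call delay jeff describes is easy to see for yourself: the first call to a function with a given argument type pays the compilation cost, and later calls reuse the compiled code:

```julia
f(x) = 2 .* x .+ 1

@time f(rand(1000));   # first call: time includes compiling f for Vector{Float64}
@time f(rand(1000));   # second call: compiled code is cached, far faster
```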
yeah multi-threading is always improving uh we’re gradually working on reducing
scheduling overhead and all that kind of thing uh you know squeezing out all the little
long tail bugs um so it’s getting it’s starting to be used very widely so that there’s like a
csv parser in julia that’s 100% julia and it’s multi-threaded and it is now for some
cases there are a lot of cases of course but for some cases it’s pretty much the fastest thing out there
and very competitive with the other csv parsers um kind of just down the pipeline from
that we have the data frames package uh which has generally not had great performance it’s not
that competitive with uh the more mature uh you know data science data table
implementations like pandas and R’s data frames uh but we are now getting there so
that’s that’s uh one of the focuses uh the people are people are now seriously
working on the performance of the julia data frames and it’s starting to get get competitive uh and we’re we’re still
working on that and hopefully we’ll you know come at least come to par with the other stuff out there
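for readers who haven’t seen it, the DataFrames.jl interface jeff is comparing against pandas looks roughly like this (a sketch; requires the DataFrames package):

```julia
# Sketch only: requires the DataFrames package.
using DataFrames

df = DataFrame(group = ["a", "b", "a", "b"], x = [1, 2, 3, 4])

# pandas-style split-apply-combine: sum x within each group
combine(groupby(df, :group), :x => sum => :total)
# rows: group "a" with total 4, group "b" with total 6
```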
um another big focus is differentiable programming uh automatic differentiation so we have
this uh we have this package zygote uh in julia that pretty much works completely so it’s a
really really powerful autodiff system and you know we do everything with like generality that’s
our big thing so it has to work with absolutely everything uh you can differentiate arbitrary programs
uh someone did a ray tracer for example so there’s a differentiable ray tracer written in julia so it’s
like you can play with you know reverse rendering kinds of problems so that all is really cool
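a minimal taste of what zygote’s interface looks like (a sketch; requires the Zygote package):

```julia
# Sketch only: requires the Zygote package.
using Zygote

f(x) = 3x^2 + 2x
gradient(f, 4.0)        # → (26.0,)  since f′(x) = 6x + 2

# it differentiates through ordinary control flow too:
g(x) = x > 0 ? sin(x) : x^2
gradient(g, 1.0)        # → (cos(1.0),)
```

this is the “differentiate arbitrary programs” point jeff makes: no special graph-building api, just plain julia functions.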
uh but it has some limitations so there are some performance issues sometimes if you do higher order derivatives uh so
that doesn’t work that well yet and we’re working on that so that’s going to be fixed i think
fairly soon and then also there’s flux.jl which is
an ml framework a machine learning framework and we’re also sort of constantly working on uh
improving the performance of that and you know bringing that up to par or better than you know all the competing things out there it’s like the
you know the machine learning uh the competition is fierce out there so it’s uh that’s something we’re always
working on awesome okay katie did you have anything to share
about the roadmap or anything you’ve been working on yeah i mean so obviously we’re still working on gpu stuff one big goal for
that is to get single-node multi-gpu working well this is something that’s really important in hpc but also for machine learning for
example flux currently does not support multi-gpu training which is a problem because pytorch does have
that functionality and it’s just important to be able to do large data sets um one thing i think is really positive in general about julia
is we now have more people who are just using the language to build things rather than becoming contributors which is good because it means you don’t
have to become a contributor and be constantly fixing bugs to use the language but you don’t actually have to care
about how git rebase works to be able to do cool things which i think is good
but if you are interested in git rebasing you can definitely keep doing that
okay cool yeah thank you very much for sharing that’s exciting to hear where the project’s heading
we’ll now answer a few of the questions that were asked throughout this webinar so i thought we’d kick it off with
miguel raz’s uh question uh he has asked jeff
what do you think is the coolest project that should be built on julia but hasn’t yet
that’s that’s a tough one but i think i’m gonna go with operating system
so it’s you know julia is definitely not designed for writing an operating system so it would
be quite a bit of work and kind of rethinking to sort of reorient it towards that but i think a
julia operating system would be would be really interesting basically entirely you know uh all of the isolation would be
essentially like you know language and type system uh enforced uh i have some ideas about
that but yeah i think that i think that would be very cool but that’s a that’s a really hard long-term project but he said coolest so this is a
superlative oh tony i think we’ve got you muted
there you go thank you yeah uh melissa uh has a
question that’s not about the coolest but uh you know what packages are high priority right now um
in increasing adoption of julia like outside of the academic settings um i think for for us the biggest most
important ones are definitely the machine learning stack and also data science um that’s what people
i get the strong impression people from non-academic settings are coming to julia because they hear it’s really good for these things
um but there’s definitely still some rough edges to all these packages um that need to be smoothed out one thing
that we definitely need more of and would actually be really great for new contributors to get involved in is writing tutorials um so like flux has
a model zoo where they implement some you know famous machine learning papers and stuff like that that’s
something that for newbies can be great to get involved in um or writing cool uh sort of towards data science
style articles using the julia data science stack awesome of course you got melissa stoked
yeah she said yeah we need more documentation everywhere um
so another question from miguel for katie how have you found working with gpus in a distributed setting in
julia um that’s a really good question uh so i think the best way to say it is that it’s not
any worse than doing it in another language so basically there’s a couple ways you can do this there’s like cuda plus mpi
um and then there’s nvidia’s nccl um so if you don’t know what all these acronyms mean don’t worry it’s not that
important but basically um the idea is to split up the problem
among many gpus which are themselves split up among many nodes and then you have to have some way for the nodes to talk to each other if
they’re doing it through mpi everything has to go through the cpu memory first so the gpu first offloads its memory
through the pcie bus and then that goes through infiniband or whatever or there’s now nccl where the gpus
communicate directly without involving the pcie bus which is good because that bus is really slow
um but in terms of just like how easy it is to write code that does any of this unless your code is like
the world’s most obvious mapreduce it’s not that easy um which is not that surprising i mean
distributed programming is really hard if it was easy everybody would be doing
it um
so i think there’s definitely a lot of opportunity for julia to make this easier um and that’s definitely something i’m
interested in pursuing in the next couple years trying to make this more accessible to people
well cool thank you so much for that um it looks like uh miguel that jeff has shared the code uh
from the demo um so we’ll follow up with miguel’s other question which is
um how well do you think julia’s multi-threading can handle lightweight network io versus go’s
multi-threading so we haven’t done a lot of thorough benchmarking of that kind of thing so
i’m not totally sure i know you know go has obviously put a huge amount of effort into that because
that is their you know number one use case so they are probably benchmarking that and fine-tuning that all day every day
so it would be hard to believe we did better than them um and i know also go has
you know a huge library of network protocols implemented and all of that which we don’t have you know we have http
but they have everything uh and but but i think you know from a very
high level i think it’s it’s roughly as good like there’s probably not you know massive differences but i’m you know i’m sure
they’re probably better at that but i think it’s like roughly as good our package server
is written in julia for instance using you know using our async io and stuff so you know it works um but yeah i don’t
have thorough benchmarks but i suspect it’s like roughly okay awesome yeah thank you very much for
sharing and thanks everyone for your questions we do appreciate that we’re now going to move on to our world
famous rant and rave section where each one of us is going to get a 15 second soap box to either rant or rave about
whatever topic they desire so i think jeff will start off with you well in 15 seconds is not enough i mean
come on but uh you know i i don’t know there’s just
there’s a lot of stuff that doesn’t work you know there’s there’s so much stuff that’s supposed to work but just doesn’t work
uh you know like there are things we’ve never gotten working like uh you know we try to intercept stack
overflows and give a nice error instead of uh instead of just crashing and we have
that working fine on linux and windows and freebsd and we just we cannot get it working on mac os
it just no matter what it always segfaults maybe sometimes it works but not very
often and we just we just can’t it doesn’t work you know there’s a signal you’re supposed to be able to handle and whatever and we’ve
tried all of that but it just doesn’t work and you know and there’s just so much stuff like that
there’s just so much stuff that doesn’t work i don’t even know where to start
um katie what about you uh yeah so um i think all my responses have been
gpu based you guys can see i’m like clearly obsessed but my big one is i’ve been wanting to use cuda from windows subsystem for
linux for years because um this is actually also a two-part rant because
as far as i’m aware the build process for julia on windows is like still horrendous um it involves mingw uh that’s like
probably all anybody needs to hear um and yeah i don’t like the windows development environment so that’s cool
um the nice thing is that with wsl2 it’s finally gonna be possible so it wasn’t possible before to do this
because uh basically wsl would not play nicely with the graphics driver um and that’s gonna change i know
shocking linux not playing nicely with proprietary graphics drivers you could have expected um so uh now that it’s finally happening
um i will be able to seamlessly switch from playing crusader kings to developing wrappers very nice very
nice and tony how about you it’s your rant or rave well katie might be the coolest person i’ve met in a very long time
but um i’m a little bit like everything sucks and everything’s broken
um but there’s some really good stuff in there you know and right now it’s tough time
and especially in this open source world being positive over negative and inclusive over exclusive is uh
really going to help us get through a lot of this work yeah yeah and the and the flip side of that is it’s really impressive how much stuff
everyone is able to do and get working in spite of everything not working so that that’s promising that’s why i have faith in
humanity we’re going to keep you good well mine’s actually a rant and it’s not
not a great topic but i’m not sure how many of you know about this but australia was looking good in terms of coronavirus like i think
it was like 13 cases a day six cases a day at one point uh and then in victoria there were a few
security guards working in hotel quarantine so essentially the government would pay for you to quarantine in a hotel for two weeks if you arrived from overseas
and security guards couldn’t keep it in their pants and were sleeping with people coming in from overseas who
had coronavirus and now it’s spread all over australia and we’re going into a second wave because of that
and it just it kills me and it’s just that’s that’s extraordinary one job
and now everyone wants to be more like america everyone yeah we’re just trying to catch up all right
well thanks everyone uh for watching and thank you very much for joining us jeff and katie and also tony
thank you for joining um that’s all we have time for today sadly but um you can find us on twitter
at openteams inc and at quansight ai jeff and katie where can
people find you and more info about julia
uh well check out julialang.org that’s the website um lots of links there uh i am just my
first name last name jeff bezanson uh on github and the julia slack uh
julia discourse forum yeah i’m on github as @kshyatt um on twitter
as @kslimes and i’m also on the julia slack a lot um when quarantine started i was live
coding a lot on twitch i kind of fell off the wagon on that but i want to get back into it so check out twitch.tv/kslimes
for the hopefully soon to be revived julia live coding awesome well yeah thank you very much
again if you like what you saw today then please go to our youtube channel and like subscribe to see more of this kind of
content uh we really do appreciate it and it really does help us so we look forward to for all of you
joining us next episode uh which will be a session on nbgrader
so thanks very much everyone thank you thank you all bye
[Music]